Turbination of Spectrum machines.
Purpose: elimination of ignorance.
(C) Nemo
________________________________
"... This aspiration can manifest itself in different forms and with varying intensity - depending on how enlightened and restrained this scorching thirst for human self-deprecation is, breaking through from the subconscious and dark depths..."
Fr. G.V. Florovsky, "Byzantine Fathers V-VIII" from readings at the Orthodox Theological Institute in Paris. Paris, 1993, p.35.
Means: the magic of words.
"... Words got tangled and doubled and carried thought along - words have their own magic and power..."
Ibid, p.7.
"... Here, it is not so much the individual words and sentences that matter, but the very style and internal tendency of thought..."
Ibid, p.9.
Rights: Reprinting and citation are permitted only in the author's edition with reference to the company (C) Nemo.
_______________
Terminology.
To begin with, it is necessary to define the range of related concepts required for unambiguous, confusion-free perception of the subsequent exposition.
Def. 1: The computational power of a PC is an integral (composite) criterion that expresses the practical speed (labor productivity) of the user's work on that PC.
This concept is conveniently illustrated by the IBM PC, which has the most developed, flexible, and diverse configuration, since computational power depends primarily on configuration and performance. The flexibility of the processor and the scalability of the configuration are provided by the embedded software: first of all DOS and, within it, the BIOS (Basic Input/Output System). By modifying the BIOS, one can obtain various hardware configurations with adequate software support. The Spectrum is exceptionally conservative in this regard, since TR-DOS is not, strictly speaking, a DOS or an OS at all. Historically, it was conceived and implemented as a means of quickly loading programs, for a hardware add-on to Sir Clive Sinclair's "rubber-keyed" machine, and this predetermined the later difficulties with the hardware development of the Spectrum. IS-DOS compares favorably in this respect: next to TR-DOS it is like a divine gift next to a fried egg.
Let us not forget about the proportionality of the powers applied in the PC subsystems. Significant, disproportionate enhancement of the powers of individual subsystems, if it leads to anything, results only in a slight increase in computational power. Specific optimal power ratios depend primarily on the range of tasks solved by the user and will differ, for example, for databases and graphic stations.
Imagine that we have installed 32 MB of RAM in a Spectrum (calm down, dear readers, this thought is not the author's own; see "Radio Amateur" No. 4, 1994, "Personal Computer Eric", p. 9). Can you picture it? The author pictures a "Boeing" engine fitted to a "Zaporozhets". Funny, isn't it? Why such a farce? The Z80 has an eight-bit data bus, a 4 MHz clock, and 64K of directly addressable memory; any segmentation mechanism will be complex, inefficient, and non-standard. Even if 32 MB worth of information could be found, it would turn into a pile of junk, just like the "Zaporozhets", without a DOS able to access and work with such volumes. Unfortunately, all of the above applies to a hard drive as well: even if you agree to work under IS-DOS, the feasibility of installing one remains in question.
To close Def. 1, it remains to clarify what serves as the measure of computational power. It is a test program that shows at what frequency the processor of some base machine would have to be clocked to match the computational power of the machine under test. This is exactly how it is done on IBM computers: the test produces a nice number in MHz, which is often mistaken for the physical frequency of the actual clock signal.
If we assume that the architecture is unchanged, which is quite close to the truth for the Spectrum, then computational power depends only on the performance of the processor and is directly proportional to it. Therefore - Def. 2:
Def. 2: The performance of the processor (computer) is the number of short commands (for example, register-register) executed by the processor in a unit of time. The dimension therefore looks like [op./sec].
This parameter is important not only for solving unreal academic tasks in unreal time. An example of such a task is counting the number of lucky tickets in a roll of the bus tickets that were once sold at ticket offices. The task is remarkable for three reasons: there is no way to solve it other than brute force; the program is exceptionally compact; and solving it consumes machine time in horse-sized doses.
Much more important for the Spectrum, as a predominantly gaming computer, is the growing potential for more complex graphics, for more graphical objects (sprites) on the screen without loss of their dynamics, for more detailed scenery (game background, second plane), and, of course, the benefit to already existing games. In games that use iterative calculations (simulators), objects move more smoothly and less jerkily.
Stepping back a bit, there are also other (architectural) ways to improve graphics quality. In the Dendy, for example, with a relatively weak processor and little RAM, high-quality scenery is achieved through direct memory access (DMA) techniques; however, the overall static nature of the picture and the poverty of the gameplay clearly betray how it is technically implemented.
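To show how compact such a program is, here is a minimal brute-force sketch. It assumes the usual convention, which the article does not spell out: a ticket carries a six-digit number and is lucky when the sum of its first three digits equals the sum of its last three. The function names are illustrative.

```python
def is_lucky(ticket):
    """A ticket is lucky if its first three digits sum to the same value as its last three.

    Assumption: tickets are numbered 000000..999999 (six digits).
    """
    digits = [int(ch) for ch in f"{ticket:06d}"]
    return sum(digits[:3]) == sum(digits[3:])


def count_lucky_tickets():
    """Pure brute force: inspect every ticket number in the roll."""
    return sum(1 for ticket in range(1_000_000) if is_lucky(ticket))


total = count_lucky_tickets()
print(total)
```

The loop body is trivial, yet it runs a million times, which is exactly the kind of load that is compact in code and heavy in machine time.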
In general, Motorola's gadgets hinge on DMA techniques. One of the latest Amiga computers has 27 DMA channels and is therefore very convenient, for example, for creating TV commercials.
So, the usefulness of increasing performance is clear. Hence Def. 3:
Def. 3: The turbination coefficient is a relative value that shows how many times (by what percentage) the performance has changed compared to the standard.
Def. 4: The standard is the basic sample of the PC, the performance of which is taken as one unit. Usually, the same PC before turbination is chosen as the standard. The drawback of such a choice is that the performance of different brands of machines in normal (ordinary) mode is already different. As a result, there is uncertainty when comparing turbined machines in terms of speed (performance).
Comment: The definitions presented in the prologue of the article may seem tedious to some, so I will provide examples of incorrect interpretations that are encountered everywhere.
A turbined disk controller is a disk controller in which the period of the head-positioning signal is reduced by a factor of 1.75 to 2. It has no bearing on the performance of the PC itself; computational power, however, does increase, albeit insignificantly, because disk access time is reduced.
The turbination coefficient is often confused with the ratio of the computer's clock frequencies before and after turbination. As a rule, this ratio equals two in all Spectrums, whereas the turbination coefficient can range from 1.25 to 2 (a gain of 25% to 100%).
The turbination coefficient is usually different for ROM and RAM: in ROM it is higher and approximately equal to two.
Calculated ratios.
The turbination coefficient is not an absolute value, as follows from the definition, so we investigate what it depends on.
┌────────────────────────────────────────┐
│Kт ≈ (F / Fб) * (Nту + Nwб) / (Nту + Nw)│ (1),
└────────────────────────────────────────┘
where:
Kт - turbination coefficient;
F - clock frequency of the processor in turbo mode (usually 7 MHz);
Fб - clock frequency of the processor of the base variant;
Nту - total number of processor cycles based on the technical conditions (TU) for Z80 necessary to execute the test fragment of the program;
Nw and Nwб - number of wait cycles created during the test fragment run in the tested and base samples, respectively.
Analysis of the formula. There is no exact equality in the formula. This is explained by the fact that F may change during operation. For example, F usually decreases when working with input-output ports; implementing the required delay using WAIT turns out to be costly in terms of circuit design. Nw may depend on the phase of the video processor.
The turbination coefficient Kт does depend on the ratio of clock frequencies. However, the desire to push F ever higher is nothing more than a temptation: beyond a certain point, increasing F leads to a sharp increase in Nw. Physically, this is explained as follows. Each machine cycle contains a memory access, and memory offers only a certain number of time windows (RAM access cycles; the machine is assumed to have a non-transparent video processor) during which the processor can exchange data with RAM. If the processor is ready and wants its data before the next window arrives, it receives a WAIT from the arbiter and stalls.
Multiplying Kт by Fб, we obtain an effective clock frequency Fэфф, which, unlike F, truly characterizes performance in relation to the base sample or base mode.
┌──────────────┐
│Fэфф = Fб * Kт│ (2)
└──────────────┘
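Formulas (1) and (2) can be tried out numerically. The sketch below is only an illustration: the function names and the sample figures (a 3.5 MHz base clock, 1000 nominal cycles, 200 wait cycles) are assumptions chosen for the example, not measurements from any machine.

```python
def turbination_coefficient(f_turbo, f_base, n_tu, n_wait, n_wait_base=0):
    """Formula (1): Kt ~ (F / Fb) * (Ntu + Nwb) / (Ntu + Nw)."""
    return (f_turbo / f_base) * (n_tu + n_wait_base) / (n_tu + n_wait)


def effective_frequency(f_base, k_t):
    """Formula (2): Feff = Fb * Kt."""
    return f_base * k_t


# Illustrative numbers only: a 7 MHz turbo mode against an assumed 3.5 MHz base,
# with a test fragment of 1000 nominal cycles that picks up 200 wait cycles.
k_t = turbination_coefficient(f_turbo=7.0, f_base=3.5, n_tu=1000, n_wait=200)
f_eff = effective_frequency(3.5, k_t)
print(f"Kt = {k_t:.2f}, Feff = {f_eff:.2f} MHz")
```

With no wait cycles at all the coefficient equals the clock ratio; every wait cycle added in turbo mode pulls Kт, and with it Fэфф, below that ratio.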
In conclusion, it should be noted that formula (1), for all its undeniable clarity, is not practical to use. Formulas are generally written for theoretical analysis and synthesis, that is, they imply doing some useful work with them. However, Nw cannot be determined without special tricks: it takes either a titanic analysis of timing diagrams or the painstaking construction of a special hardware "mousetrap" that catches stray active WAITs and counts Nw. In practice, the task of determining Kт (or Fэфф) is solved using test programs.
Methods of turbination.
All methods of turbination essentially boil down to a more rational and economical use of the limited number of windows available when working with RAM. By skillfully manipulating the clock frequency (CLC) and WAIT, one must make the processor grab not just anything, but specifically what lies in RAM; that is, synchronize the moment the processor latches data (or, on output, the moment its data is valid) with the readiness of RAM. Usually this function is performed by the arbiter. It receives information about what the processor is doing or intends to do, and about the phase of the RAM and video processor timing windows. The arbiter's job is to analyze this information and control the WAIT line. By halting the processor in time, the arbiter prevents it from grabbing garbage from the bus. Usually the arbiter is implemented as a tree (multi-layer logic) of prohibitions, for example on a PAL16-R8 IC, which also provides the necessary degree of signal synchronization, that is, a synchronous digital automaton, as indicated by the index R (Register). The tree becomes sprawling, since every situation in which the processor must be slowed down has to be anticipated. This is an example of apophatic turbination, in which the truth of WAIT is determined by negating, piece by piece, a series of unsuitable situations.
There is also another method, used in the KAY-256 TURBO computer from (C) Nemo, based on phase auto-tuning of processor cycles to RAM windows. This method can be called dynamic modification of machine cycles (DMMM). Interestingly, implementing DMMM requires no additional hardware, only a slight redistribution of functions among the circuits already present. There is no arbiter: in DMMM the arbiter degenerates into a simple zero sensor and synchronizer of the phase auto-tuning (FAP) system and exists only virtually. No arbiter, no tree, no costs.
The DMMM method is the simplest and at the same time the most effective method of turbination for Spectrum machines. However, this holy simplicity is quite deceptive, since a practical implementation of DMMM demands a high standard of circuit design and, above all, a high degree of synchronization between the Spectrum's subsystems. Modifying existing Spectrums is therefore unreasonably labor-intensive, if possible at all. It is easier to rebuild a quarter of the computer's circuitry around the PAL16-R8. Indeed, again, gentlemen, again. The simplicity of DMMM translates not only into elegant circuit design but also into convenient timing analysis, and into something more substantial, which will be discussed below. With DMMM, the execution time of the various machine cycles differs significantly from that prescribed by the Z80 specification, but the number of clocks per cycle is always a multiple of 4 and is easy to recalculate. Clock counts are converted from the datasheet values (per the TU for the Z80) to DMMM according to the following algorithm:
     TU ----> DMMM
┌─
│     3 ----> 4
│     4 ----> 4        (3)
│   >=5 ----> 8
└─
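Transformation (3) can be sketched in a few lines. The machine-cycle breakdown of NOP and LD (nn),HL used here is the one given in the article's own recalculation example; the function names and the doubled clock ratio are assumptions made for illustration.

```python
def dmmm_cycle_length(tu_clocks):
    """Map one machine cycle from its Z80 datasheet (TU) length to its DMMM length, per (3)."""
    if tu_clocks < 3:
        raise ValueError("a Z80 machine cycle is at least 3 clocks long")
    return 4 if tu_clocks <= 4 else 8


def dmmm_instruction_length(machine_cycles):
    """Total DMMM clocks for an instruction given as a list of TU machine-cycle lengths."""
    return sum(dmmm_cycle_length(c) for c in machine_cycles)


NOP = [4]                   # opcode fetch only
LD_NN_HL = [4, 3, 3, 3, 3]  # opcode fetch, two operand reads, two memory writes

clock_ratio = 2.0  # assumption: the turbo clock is twice the base clock
for name, cycles in (("NOP", NOP), ("LD (nn),HL", LD_NN_HL)):
    tu = sum(cycles)
    dmmm = dmmm_instruction_length(cycles)
    # Per-instruction Kt, assuming the base machine inserts no waits.
    print(f"{name}: {tu} -> {dmmm}, Kt ~ {clock_ratio * tu / dmmm:.2f}")

tu_total = sum(NOP) + sum(LD_NN_HL)
dmmm_total = dmmm_instruction_length(NOP) + dmmm_instruction_length(LD_NN_HL)
print("Total:", tu_total, "->", dmmm_total)
```

Under these assumptions a run of NOPs reaches the full clock ratio, while an instruction built from 3-clock cycles, such as LD (nn),HL, falls short of it because every such cycle is stretched to 4 clocks.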
Once a table of the new clock lengths of commands has been compiled, one can return to the traditional command-by-command counting of execution time. The algorithm fully describes DMMM, so a reference to such an algorithm indicates that the DMMM method is in use, regardless of the brand of computer. This approach to timing analysis makes counting WAITs (Nw) unnecessary. Note that processor performance is maximal, reaching its theoretical limit, for short (the simplest one-byte) commands, which fits Def. 2 very well: for example, LD r,r', ADD A,r, NOP, and so on. This somewhat resembles the RISC technology of IBM computers. The DMMM method makes almost full use of the time available for accessing RAM, productively using nearly all the windows provided, and yields close to the maximum Kт and Fэфф for computers with a non-transparent video processor. Kт can be raised to the theoretical limit by additionally overclocking the processor on the BORDER; in that case, transformation (3) looks as follows:
     TU ----> DMMM
┌──
│     3 ----> 4
│┌─   4 ----> 4            (4)
││  >=5 ----> 8    BORDER
││  5,6 ----> 6   /BORDER
│└─
└──
However, the most valuable aspect of the DMMM method is its linearity in time across the address space and across the phase of the video processor. Under transformation (4), linearity in time across the phase of the video processor is lost, which, in the author's opinion, turns a slight gain in Kт (5 - 7%) into a loss in the overall characteristics of the system.
As an example, consider the following passage from the user manual of the branded Spectrum +3 (p. 189). "RAM banks come in two types: combined, which are RAM pages 4 to 7 (sharing time with the video processor) and exclusive from 0 to 3 (which are used solely by the processor). Any machine code programs with critical execution time (such as music or related "communications" programs) should be placed in exclusive banks. For example, a sequence of NOPs (this is nothing more than a reference to the test program - author), located in combined banks, gives an effective frequency (Fэфф - author) of 2.66 MHz against the normal 3.55 MHz (Fб - author). This results in a speed reduction (dKт, in this case, it is negative - author) of about 25%." For the KAY-256 computer, this limitation does not exist.
Interestingly, apophatic turbination, in the limit, yields a similar transformation (3) or (4), at incomparably higher hardware cost and complexity of implementation. This is explained by the order and sequence of the RAM availability windows, which are similar in all machines with a non-transparent video processor.
To conclude on turbination, it should be mentioned that the turbination mechanisms can be disabled when accessing TR-DOS. In the KAY-256, this had to be done to maintain compatibility on the bus with the base disk controller. Turbo mode can be controlled programmatically (OUT to a port), at the system level (an open-collector (OC) output pulling the *TURBO line of the system bus to logical "0"), or with a switch on the front panel. To disable turbo mode, a single prohibition is sufficient; to enable it, all three permissions are required. On RESET, the motherboard switches to turbo mode.
Example of cycle count recalculation.
Command      Cycle      TU        DMMM
NOP          M1 (OCR)    4  --->   4
LD (nn),HL   M1 (OCR)    4  --->   4
             M2 (ORL)    3  --->   4
             M3 (ORH)    3  --->   4
             M4 (MWL)    3  --->   4
             M5 (MWH)    3  --->   4
             ──────────────────────
             Total:     20  --->  24
Methods of testing (measuring) Kт.
As follows from the definition (Def. 3; formula (1)), Kт depends on Nwб and Nw, which in turn depend on the parameters of the test fragment of the program. For DMMM in particular, Kт depends on the ratio of short commands to long ones. For apophatic turbination the connection is less transparent, but it exists as well. One can go even further and state that for each model of turbo-Spectrum there is a test of its own that yields a higher Kт than any other test.
Special test programs typically use a hardware timer: the INT signal, whose period is set quite accurately by the video processor. The test fragment is arranged as a loop with a counter between two (or more) consecutive INTs. The counter value after the test fragment has run is then normalized in the spirit of formula (1), relative to the chosen base sample, and the result is displayed on the screen. One must be well aware that Kт depends on the specific implementation of the test fragment and on the sample chosen for normalization; that is, the test and the sample essentially define the measure of Kт. A Kт value quoted without reference to the test and the base sample tells you about as much as the price of firewood in Australia. The INT test averages Kт over the video processor cycle, so it gives no real information about non-linearity ("deflation" of processor speed).
Below are the results of testing turbo-Spectrums with the "ZX-BENCHMARK TEST", which you can find in the "Appendix" section of our magazine (this program was first published in "Radio Amateur", No. 7, 1994), and with the border test.
┌───────────────────╥───────────────────────╥─────────────┐
│                   ║ Kт in %, ZX-BENCHMARK ║ Border test │
│ Computer          ║ RAM │ ROM ║ RAM │ ROM ║     RAM     │
├───────────────────╫─────┼─────╫─────┼─────╫─────────────┤
│ KAY256            ║  93 │  95 ║ 100 │ 100 ║     100     │
│ KAY256 turbo      ║ 170 │ 195 ║ 182 │ 205 ║     174     │
├───────────────────╫─────┼─────╫─────┼─────╫─────────────┤
│ Pentagon128       ║ 100 │ 100 ║ 108 │ 105 ║   no data   │
│ Pentagon128 turbo ║ 160 │ n/d ║ 172 │ n/d ║   no data   │
├───────────────────╫─────┼─────╫─────┼─────╫─────────────┤
│ Scorp.            ║  93 │  95 ║ 100 │ 100 ║     100     │
│ Scorp. turbo      ║ 145 │ 195 ║ 156 │ 205 ║     178     │
└───────────────────╨─────┴─────╨─────┴─────╨─────────────┘
n/d: no data.
In this table, any machine that has 100% performance in RAM can be taken as the base sample against which Kт is measured. Analyzing the table, one can see that the accuracy of testing is about 5% (it is not entirely clear how, with a twofold increase in clock frequency, the speed of operation in ROM increased by 105%). Also, as expected, the value of Kт depends on the test, and different tests can show differences between machine speeds that have opposite signs.
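The dependence of Kт on the base sample is easy to check numerically. The sketch below rests on an interpretation that the author does not state outright: that the second pair of ZX-BENCHMARK columns is the first pair re-normalized so that KAY256 in normal mode equals 100%. The function name is illustrative; the RAM figures are taken from the table.

```python
def renormalize(results, base):
    """Re-express Kt percentages relative to a different base sample."""
    base_value = results[base]
    return {name: 100.0 * value / base_value for name, value in results.items()}


# RAM column of ZX-BENCHMARK with the Pentagon128 in normal mode as 100%.
ram_vs_pentagon = {
    "KAY256": 93,
    "KAY256 turbo": 170,
    "Pentagon128": 100,
    "Pentagon128 turbo": 160,
    "Scorp.": 93,
    "Scorp. turbo": 145,
}

ram_vs_kay = renormalize(ram_vs_pentagon, "KAY256")
for name, value in ram_vs_kay.items():
    print(f"{name:18s} {value:6.1f}")
```

The recomputed values agree with the second RAM column of the table to within about one percentage point, which is consistent with the stated testing accuracy. The same Kт for the same machine thus changes by several percent simply because a different machine was declared to be 100%.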
The impact of turbination on software compatibility.
The software compatibility of machines in the Spectrum family seems to be perceived by users as a kind of inferiority complex. Users go to great lengths chasing elusive fractions of a percent of this very compatibility: they install port #FF, route address line A12 to the sound processor's port, build ever more exact address decoders; there is no end to the nonsense.
So, gentlemen, I must disappoint you: the performance of the computer also affects software compatibility, albeit very weakly. The author finds the situation amusing. With turbination, the losses in software compatibility turn out to be greater than those unfortunate fractions of a percent, which therefore lose all meaning: turbo machines with those gadgets and without them become equivalent. The user has to choose between the real characteristics of the system and the fixed idea of 100% software compatibility. Of course, one can sit on two chairs, having performance very close to 100% at normal speed plus the option of switching on turbo mode. In essence, such a solution changes nothing fundamental. For example, the Pentagon, as we have already established, is a slightly turbined machine relative to the branded Spectrum. If a time-critical section of a program is written on the Pentagon and squeezed in between INTs with no margin, the result is software incompatibility with the rest of the Spectrum community. Incidentally, the poor "VICOMM"-type modem connection of the Pentagon with other Spectrums is explained precisely by the non-standard working speed (performance) of the former. Perhaps the solution lies in writing adjustment routines that compensate for the differences in performance between different brands of machines, turbined ones above all.
Stepping back a bit, the author wants to state that he considers it misguided to cut the pins off chips in computers that for some reason do not run Pentagon programs. There may be more such programs than there are pins in the computer. Of course, the creation of incompatible programs on the Pentagon is not the fault of Pentagon owners. It just so happened that the Pentagon was the first mass-produced Spectrum with a disk controller, and its owners accumulated the most programming experience. However, it is still worth bearing in mind that the Pentagon is neither a benchmark nor the most widespread machine at the moment. The activity of programmers now resembles the fire of a half-drunk "GRAD" launcher crew that makes up for its inability to aim with the sheer mass of the barrage, preferring carpet bombing to aimed fire.
Non-linearity in time across the phase of the video processor is very significant for programs that run in real time. These include, for example, voice synthesis, music, Fourier transforms, function sequence generators, and communications (data exchange) programs. For such programs linearity is essential; simply put, they cannot be written for a machine with non-linear turbination. As a thought exercise, try to imagine a voice synthesized by direct coding on a computer whose processor is overclocked on the BORDER and slows down while video memory is being scanned. In the author's opinion, it would be a rattling duet with the harmonics of the frame scan. The ability to create real-time programs for the Spectrum is still no more than a potential, since no such possibility existed before. Previously, the smallest indivisible time interval was the period of the system timer INT (about 20 ms); now time in the Spectrum can be set with the accuracy of a machine cycle (on the order of microseconds).