Deja Vu #05: CODING: Optimization Techniques for ZX Spectrum Screen Scrolling

SoundTrack: SECTOR/SERIOUS SPECCY GROUP'98
__________________________________________

(C) Kolotov Sergey, SerzhSoft, May 1998.
__________________________________________

OPTIMIZATION Again

To the Great and Terrible Leader of Samara Hackers M.M.A aka UnBeliever, with a smile.

In the fourth issue of Deja Vu, readers could acquaint themselves with M.M.A's article on optimization. Undoubtedly, Maxim raised a very good topic and described everything quite interestingly. As a reminder, it was about writing a procedure that quickly scrolls the entire screen upwards. The article included the program's source code in the format of the new cool assembler STORM. Indeed, it is a quite convenient and, most importantly, _very_ fast assembler. But... Personally, I didn't like it, although this is a purely subjective opinion. I am just not used to "glued" mnemonics, where commands and operands are separated by just one space... But still, the guys from X-TRADE are simply great! I advise everyone to try coding in STORM, it's cool! If only DARK and LD would make normal mnemonics, I would have storm'ed it without hesitation long ago!

Since this assembler has not yet become widespread, surely not everyone has had the chance to study the source code of M.M.A's procedure. So it makes sense to print it in "normal" text format, which we will do now...

;-------------------------------------
; ABSOLUTELY COOL LDI SCROLL
; COMPLETELY CODED BY M.M.A
;-------------------------------------
ORG #8000

CALL INSTAL ; Build the screen address table

LD HL,#C000
LD DE,#4000
LD BC,6144
LDIR ; Copy the image loaded at
; address #C000 to the screen

;-------------------------------------
; Call the procedure to shift
; the screen up one line 192 times
;-------------------------------------
LD B,192
LOO PUSH BC
DI
CALL ONELINE
EI
HALT ; can be removed to get uniform "jitter" of the lines

POP BC
DJNZ LOO
RET
;------------------------------------
INSTAL LD HL,#4000; This installer
LD B,192; of addresses can
LOOP2 CALL DOWNHL; be made faster,
PUSH HL; but why?
METK3 LD (TABL),HL;
LD HL,(METK3+1)
INC HL
INC HL
LD (METK3+1),HL
POP HL
DJNZ LOOP2
RET
;-------------------------------------
; THE PROCEDURE FOR SHIFTING UP ONE LINE
;-------------------------------------
ONELINE
LD (METK2+1),SP; 20
LD BC,6144-#20; 10
LD HL,#4000; 10
LD (METK1+1),HL; 16
LD HL,TABL; 10
LD (METK4+1),HL; 16

METK4 LD SP,0; 10 ( T )
POP HL; 11 ( O )
LD (METK4+1),SP; 20 ( T )
METK1 LD DE,0; 10 ( A )
LD (METK1+1),HL; 16 ( L )
.32 LDI ; 16*32 ( )
JP PE,METK4; 10 ( 578 )

LD SP,#5800; 10
.16 PUSH BC; 11*16
METK2 LD SP,0; 10

RET ; TOTAL 110682 CYCLES

; ++++++++++++++++++++++++++++++++++++
; + +
; + "DOWN HL" SUBROUTINES FROM MASM1.1 +
; + +
; ++++++++++++++++++++++++++++++++++++
;
;For speed, all JR jumps to the RET command
;were replaced with RET NZ,RET C

DOWNHL INC H; 4
LD A,H; 4
AND 7; 7
RET NZ; 5/11
LD A,L; 4
ADD A,32;7
LD L,A; 4
RET C; 5/11
LD A,H; 4
SUB 8; 7
LD H,A; 4
RET ; 10 TOTAL 55
TABL

And now a few comments... On closer examination, one can notice that the cycles are counted incorrectly in places... First, the POP HL command takes not 11 but 10 cycles! Second, this command does not seem to be counted at all: the figures written next to the loop's commands add up to 589, and 589-11 is exactly the 578 shown in the listing! With the correct timing for POP HL, the loop (the one marked "TOTAL - 578") actually takes not 578 cycles but a full 588!!! The loop runs 191 times, so the whole procedure takes noticeably longer than stated: 82+588*191+196 = 82+112308+196 = 112586 cycles instead of 110682! Almost 2 thousand cycles on top!
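
The recount is easy to check mechanically. The following is only a sketch in Python, not part of either listing: it sums the standard Z80 timings of the commands in ONELINE. The names SETUP, LOOP, TAIL and PASSES are introduced here purely for illustration, and RET is left out of the sum, as in the article.

```python
# Standard Z80 timings (in cycles) of the commands used in ONELINE.
SETUP = [20, 10, 10, 16, 10, 16]  # LD (nn),SP / LD BC,nn / LD HL,nn / LD (nn),HL / LD HL,nn / LD (nn),HL
LOOP = [
    10,       # LD SP,nn
    10,       # POP HL (10 cycles, not 11)
    20,       # LD (nn),SP
    10,       # LD DE,nn
    16,       # LD (nn),HL
    16 * 32,  # 32 x LDI
    10,       # JP PE,nn
]
TAIL = [10, 11 * 16, 10]          # LD SP,#5800 / 16 x PUSH BC / LD SP,nn

# BC starts at 6144-#20 bytes and every pass moves 32 of them.
PASSES = (6144 - 32) // 32

setup, loop, tail = sum(SETUP), sum(LOOP), sum(TAIL)
total = setup + loop * PASSES + tail
print(setup, loop, PASSES, tail, total)
```

Run as is, this prints 82 588 191 196 112586, which agrees with the figures above.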

So, having familiarized ourselves with the program, let's try to optimize it, or even better - write it "from scratch"! I won't torture you for long with my musings, just watch and figure it out:

;--------------------------------------;
; FULL SCREEN SCROLL UP ;
; coded by Kolotov Sergey ;
; (c) SerzhSoft, Shadrinsk, may, 1998 ;
;--------------------------------------;
_NULL EQU 0
;--------------------------------------;
_DATA EQU #6000
SCR_TBL EQU _DATA ;,#0300
DATAEND EQU SCR_TBL+#0300
;--------------------------------------;
ORG #8000
;--------------------------------------;
MAINPRG
EI
;
CALL MK_STBL ;builds SCR_TBL, then falls through
; CALL MKSRLUP ;not needed: MK_STBL runs into it
;
LD HL,#0000
LD DE,#4000
LD BC,#1800
LDIR
;
LD B,#C0
LP_MAIN PUSH BC
HALT
DI
CALL SRL_UP
EI
POP BC
DJNZ LP_MAIN
;
RET
;--------------------------------------;
MK_STBL
LD HL,SCR_TBL
LD DE,#4000
LD B,#C0
LP_MSTB LD (HL),E
INC HL
LD (HL),D
INC HL
;
INC D
LD A,D
AND #07
JR NZ,$+12
LD A,E
ADD A,#20
LD E,A
JR C,$+6
LD A,D
SUB #08
LD D,A
;
LD (HL),E
INC HL
LD (HL),D
INC HL
DJNZ LP_MSTB
; RET ;omitted: execution continues into MKSRLUP
;--------------------------------------;
MKSRLUP
LD HL,SRL_UP
LD (HL),#ED ;ld (...),sp
INC HL
LD (HL),#73
INC HL
PUSH HL
INC HL
INC HL
LD (HL),#01 ;ld bc,...
INC HL
LD (HL),#E0
INC HL
LD (HL),#17 ;ld bc,#17E0
INC HL
LD (HL),#31 ;ld sp,...
INC HL
LD DE,SCR_TBL
LD (HL),E
INC HL
LD (HL),D ;ld sp,SCR_TBL
INC HL
PUSH HL ;lp_srup
LD (HL),#D1 ;pop de
INC HL
LD (HL),#E1 ;pop hl
INC HL
LD B,#20
LP_MSU1 LD (HL),#ED ;ldi
INC HL
LD (HL),#A0
INC HL
DJNZ LP_MSU1
LD (HL),#EA ;jp pe,...
INC HL
POP DE
LD (HL),E
INC HL
LD (HL),D ;jp pe,lp_srup
INC HL
LD (HL),#31 ;ld sp,...
INC HL
LD (HL),B ;#00
INC HL
LD (HL),#58 ;ld sp,#5800
INC HL
LD B,#10
LP_MSU2 LD (HL),#C5 ;push bc
INC HL
DJNZ LP_MSU2
LD (HL),#31 ;ld sp,_NULL
INC HL
EX DE,HL
POP HL ;ld (...),sp
LD (HL),E ; ^^^
INC HL
LD (HL),D
EX DE,HL
INC HL
INC HL
LD (HL),#C9 ;ret
RET
;--------------------------------------;
_CODE EQU $
SRL_UP EQU _CODE ;,#0066
CODEEND EQU SRL_UP+#0066
;--------------------------------------;

A few words about the program... As you can see, everything is written in the traditional "realtime programming" style, whose characteristic feature is that fast, large procedures are built at run time. In this case, a fast procedure for shifting the entire screen up one line (SRL_UP) is generated. This is the job of the MKSRLUP procedure: after it has run, the "additional code segment" contains something like:

SRL_UP
LD (SP_SRUP+1),SP ;20
LD BC,#17E0 ;10
LD SP,SCR_TBL ;10
;
LP_SRUP POP DE ;10 (TOTAL)
POP HL ;10 ( 542 )
.32 LDI ;16*32 ( * )
JP PE,LP_SRUP ;10 ( 191 )
;
LD SP,#5800 ;10
.16 PUSH BC ;11*16
;
SP_SRUP LD SP,_NULL ;10
RET
;
TOTAL: 40 + 103522 + 196 = 103758 cycles!
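
To see what the generator actually writes, it can be modelled outside the Spectrum. The Python sketch below is an illustration, not the author's code: it emits the same opcode bytes in the same order as MKSRLUP. The function name make_srl_up and its parameters are hypothetical. The base address #8093 is an assumption: ORG #8000 plus the 147 bytes of the program. The operand of the final LD SP is written as zero here, whereas on the real machine it is simply left for SRL_UP to patch when it runs.

```python
def make_srl_up(base, scr_tbl=0x6000):
    """Return the bytes that MKSRLUP would write starting at `base`."""
    def word(n):                     # little-endian 16-bit operand
        return [n & 0xFF, n >> 8]

    code = []
    code += [0xED, 0x73, 0, 0]       # ld (SP_SRUP+1),sp (operand back-patched below)
    code += [0x01] + word(0x17E0)    # ld bc,#17E0
    code += [0x31] + word(scr_tbl)   # ld sp,SCR_TBL
    lp_srup = base + len(code)
    code += [0xD1, 0xE1]             # pop de / pop hl
    code += [0xED, 0xA0] * 32        # 32 x ldi
    code += [0xEA] + word(lp_srup)   # jp pe,LP_SRUP
    code += [0x31] + word(0x5800)    # ld sp,#5800
    code += [0xC5] * 16              # 16 x push bc
    sp_srup = base + len(code)
    code += [0x31] + word(0)         # SP_SRUP: ld sp,_NULL
    code += [0xC9]                   # ret
    code[2:4] = word(sp_srup + 1)    # address of the operand of the final ld sp
    return bytes(code)

routine = make_srl_up(0x8093)
print(len(routine), hex(len(routine)))
```

The length comes out as 102 bytes, that is #0066, which matches the size reserved by CODEEND EQU SRL_UP+#0066.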

The MK_STBL procedure creates the table of screen addresses SCR_TBL in the data segment; it occupies 768 bytes. Since loading some screen is not very convenient (and a suitable one would also have to be found! :-)), it was decided to copy to the screen the standard image present in all computers, located in ROM at address #0000. :-) It is called "CHAOS," although some authorities insist on another name, namely "Broken TV"! But never mind. The main thing is that the scrolling effect is clearly visible, and it is also quite clear where exactly the raster beam "overtakes" our procedure...
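
For readers who want to check the table without a Spectrum at hand, here is a rough Python model of what MK_STBL computes. It is a sketch only: next_line and make_table are names made up for this illustration, and the table is held as pairs of integers rather than as bytes in memory.

```python
def next_line(addr):
    """Address of the pixel line below `addr` (same steps as DOWNHL / MK_STBL)."""
    h, l = addr >> 8, addr & 0xFF
    h += 1                       # next pixel row inside the character cell
    if h & 7:
        return (h << 8) | l
    l += 0x20                    # cell boundary crossed: next character row
    if l > 0xFF:                 # carry: we moved into the next screen third
        return (h << 8) | (l & 0xFF)
    return ((h - 8) << 8) | l    # no carry: stay inside the same third

def make_table(start=0x4000, lines=192):
    """Model of SCR_TBL: one (destination, source) pair per loop iteration."""
    table, addr = [], start
    for _ in range(lines):
        below = next_line(addr)
        table.append((addr, below))  # POP DE receives addr, POP HL receives below
        addr = below
    return table

tbl = make_table()
print(len(tbl) * 4)                          # size of the table in bytes
print(hex(tbl[1][0]), hex(tbl[8][0]), hex(tbl[64][0]), hex(tbl[191][0]))
```

Each pair takes four bytes, which gives the 768 bytes mentioned above. Note that the last pair has #5800 as its source, but SRL_UP never reaches it, because BC runs out after 191 lines.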

The program occupies only 147 bytes!

If you are willing to sacrifice more memory, the speed can be increased further... The program then changes slightly:

...
;--------------------------------------;
LNS_NUM EQU #40
;--------------------------------------;
...
;--------------------------------------;
MKSRLUP
LD HL,SRL_UP
LD (HL),#ED ;ld (...),sp
INC HL
LD (HL),#73
INC HL
PUSH HL
INC HL
INC HL
LD (HL),#01 ;ld bc,...
INC HL
LD (HL),#E0
INC HL
LD (HL),#17 ;ld bc,#17E0
INC HL
LD (HL),#31 ;ld sp,...
INC HL
LD DE,SCR_TBL
LD (HL),E
INC HL
LD (HL),D ;ld sp,SCR_TBL
INC HL
LD (HL),#C3 ;jp ...
INC HL
LD D,H
LD E,L
LD BC,#0042+2
ADD HL,BC
EX DE,HL
LD (HL),E
INC HL
LD (HL),D ;jp lp_srup+#0042
INC HL
PUSH HL ;lp_srup
LD C,LNS_NUM
LP_MSU0 LD (HL),#D1 ;pop de
INC HL
LD (HL),#E1 ;pop hl
INC HL
LD B,#20
LP_MSU1 LD (HL),#ED ;ldi
INC HL
LD (HL),#A0
INC HL
DJNZ LP_MSU1
DEC C
JR NZ,LP_MSU0
LD (HL),#EA ;jp pe,...
INC HL
POP DE
LD (HL),E
INC HL
LD (HL),D ;jp pe,lp_srup
INC HL
LD (HL),#31 ;ld sp,...
INC HL
LD (HL),B ;#00
INC HL
LD (HL),#58 ;ld sp,#5800
INC HL
LD B,#10
LP_MSU2 LD (HL),#C5 ;push bc
INC HL
DJNZ LP_MSU2
LD (HL),#31 ;ld sp,_NULL
INC HL
EX DE,HL
POP HL ;ld (...),sp
LD (HL),E ; ^^^
INC HL
LD (HL),D
EX DE,HL
INC HL
INC HL
LD (HL),#C9 ;ret
RET
;--------------------------------------;
LN_SRUP EQU LNS_NUM*#0042+#0027
;--------------------------------------;
_CODE EQU $
SRL_UP EQU _CODE ;,LN_SRUP
CODEEND EQU SRL_UP+LN_SRUP
;--------------------------------------;

As you can see, the MKSRLUP procedure has changed, which now generates a much larger SRL_UP procedure:

SRL_UP
LD (SP_SRUP+1),SP ;20
LD BC,#17E0 ;10
LD SP,SCR_TBL ;10
JP LP_SRUP+#0042 ;10
;
LP_SRUP POP DE ;10
POP HL ;10
.32 LDI ;16*32
...
POP DE ;10 TOTAL
POP HL ;10 LNS_NUM
.32 LDI ;16*32 TIMES
... /
POP DE ;10 /
POP HL ;10 /
.32 LDI ;16*32 /
;
JP PE,LP_SRUP ;10
;
LD SP,#5800 ;10
.16 PUSH BC ;11*16
;
SP_SRUP LD SP,_NULL ;10
RET

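The constant LNS_NUM determines the number of screen lines that are copied in one iteration of the loop. Only 191 lines have to be moved, while the passes through the unrolled body always cover a multiple of LNS_NUM lines, so the entry jump JP LP_SRUP+#0042 skips the first POP/POP/LDI group; this is the "-532" term below. Thus, with LNS_NUM = 64 (#40), the total execution time of the procedure is calculated as follows: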

50 + ((532*64+10)*(192/64)-532) + 196 =
= 246 + 101642 = 101888 cycles.

So the gain is 1870 cycles, which, of course, is not that much. The point is rather to understand the principles of optimizing programs for execution time, and to use them where needed and where not!
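
The same arithmetic can be tried for other values of LNS_NUM. The Python sketch below simply restates the timing formula and the LN_SRUP size formula from the listing. The function names are invented for this example, and it assumes that LNS_NUM divides 192 evenly, which the entry jump relies on.

```python
def srl_up_cycles(lns_num, lines=192):
    """Cycles for one call of the unrolled SRL_UP (RET not counted, as in the text)."""
    if lines % lns_num:
        raise ValueError("LNS_NUM must divide the number of screen lines")
    per_line = 10 + 10 + 16 * 32   # POP DE + POP HL + 32 x LDI = 532
    passes = lines // lns_num      # how many times JP PE is executed
    setup = 20 + 10 + 10 + 10      # three loads plus the entry JP
    tail = 10 + 11 * 16 + 10       # LD SP,#5800 + 16 x PUSH BC + LD SP,nn
    body = (per_line * lns_num + 10) * passes - per_line  # entry JP skips one line
    return setup + body + tail

def srl_up_size(lns_num):
    """Length in bytes of the generated routine (the LN_SRUP formula)."""
    return lns_num * 0x42 + 0x27

for n in (1, 64, 192):
    print(n, srl_up_cycles(n), srl_up_size(n))
```

With these formulas, LNS_NUM = 64 gives 101888 cycles for 4263 bytes of generated code, while unrolling all 192 lines gives 101868 cycles for 12711 bytes: another 20 cycles at the price of three times the memory.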

Finally, I would like to note that in any case, for transferring large amounts of data, programs that work with the stack perform the fastest. Perhaps in the next article we will talk about this. For now, try to figure it out yourself... Good luck!
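
A rough feeling for why the stack wins can be had from the standard Z80 timings alone: LDI moves one byte in 16 cycles, while POP (10 cycles) and PUSH (11 cycles) each handle two bytes. The Python sketch below is only a back-of-the-envelope estimate with made-up function names. It deliberately ignores the cost of reloading SP between the source and the destination, so a real routine will be somewhat slower than the second figure.

```python
LDI = 16             # cycles to move one byte with LDI
POP, PUSH = 10, 11   # cycles to read / write one register pair (two bytes)

def ldi_cost(nbytes):
    """Cycles to copy `nbytes` with a fully unrolled chain of LDI commands."""
    return LDI * nbytes

def stack_cost(nbytes):
    """Cycles to POP every pair from the source and PUSH it to the destination.

    SP reloads are ignored, so this is a lower bound, not a measurement.
    """
    return (POP + PUSH) * (nbytes // 2)

print(ldi_cost(6144), stack_cost(6144))
```

The listings above already use half of this trick: the bottom line of the screen is cleared with 16 PUSH BC commands instead of a byte-by-byte loop.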

With best wishes
Serzh.
