This sounds like the kind of thing where Raymond Chen would write up a historically completely sensible rationale for why that code is the way it is.
The Microsoft code leak mentioned in one of the comments has been out there for years, so I might as well paste it here to cut down on some of the speculation. Fair use - commercial value is zero, historical value for analysis and criticism is high.
The relevant code comments seem to be
"Fix timing problem??"
and
"486 bug - must wait till after last "out f0" to clear fp exceptions or IGNNE# will be permanently active."
  public __fpIRQ13
  __fpIRQ13:
      cli
      WASTE_TIME 70
      push ax
      xor al, al
      NULL_JMP
      out 0f0h, al        ; reset busy line.
      NULL_JMP
      mov al, 65h
      NULL_JMP
      out 0a0h, al        ; EOI slave irq 5
      NULL_JMP
      mov al, 62h
      NULL_JMP
      out 20h, al         ; EOI master irq 2
      NULL_JMP
      pop ax
      sub sp, 2
      push bp
      mov bp, sp
      fnstsw [bp+2]
      WASTE_TIME
      push ax
      xor al, al
      NULL_JMP
      out 0f0h, al        ; reset busy line.
      NULL_JMP
      pop ax
      pop bp
      ; fnclex            ; 486 bug - must wait till after last
                          ; "out f0" to clear fp exceptions
                          ; or IGNNE# will be permanently active.
      WASTE_TIME
      push ax
      xor al, al
      NULL_JMP
      out 0f0h, al        ; reset busy line.
      NULL_JMP
      pop ax
      ; fnclex            ; 486 bug - must wait till after last
                          ; "out f0" to clear fp exceptions
                          ; or IGNNE# will be permanently active.
      WASTE_TIME
      push ax
      xor al, al
      NULL_JMP
      out 0f0h, al        ; reset busy line.
      NULL_JMP
      pop ax
      fnclex              ;Now this is safe.
      WASTE_TIME 70       ;Fix timing problem??
      jmp __FPEXCEPTION87P
The only way to have more fun than abstracting broken software is abstracting broken hardware.
I can imagine somebody spent months on those few lines of assembly.
This is what you’d run across in codebases before the internet, let alone Stack Overflow.
People didn’t have code to copy and paste, so they randomly wrote it like monkeys until it worked, based on their understanding of one page of a manual, which was literally the only documentation or description anywhere of how the system they were working with worked.
Source: I was there :)
FPUs in the early x86 family are weird. They were typically on separate chips so you could have an 8088+8087, 80286+287, 80286+287XL (which was actually a 80387), 80386+387 (SX and DX models for 24 or 32 bit bus), 80386+287[1], 80386 or 486+Weitek[2], 80386+Weitek+387, 80486SX+80487 where the co-processor was a full CPU that disabled the main chip. And then there were the clones doing creative things such as the Nx586+587[3], which because of its lack of on-board FPU was often confused for a 386 by software and lost the advantage of its Pentium ops.
So I'm not surprised the exception handler is a mess. It's a domain built entirely out of corner-cases.
[1] https://old.reddit.com/r/retrobattlestations/comments/hj12ck...
[2] https://micro.magnet.fsu.edu/optics/olympusmicd/galleries/ch...
It is clearly written to not use the (F)WAIT instruction -- the "dumb" code is there to make sure the previous 80287 instruction has completed.
The first time-wasting code is long because it has to be slower than the slowest 287 instruction takes to complete after signaling an error. The other time wasters are shorter because they come after known instructions that are faster (FNSTSW just stores 2 bytes to memory, FNCLEX clears some bits inside the 287). Note also that they are FNSTSW and FNCLEX, not FSTSW and FCLEX -- that means there is no implicit (F)WAIT instruction before the real 287 instruction.
Why two FNCLEX? I don't know.
Why 4 writes to port F0? Probably in case the FNSTSW and FNCLEX instructions lead to errors.
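To make the port traffic easier to see, here is a rough Python model that replays only the OUT instructions from the listing above, in order. The port numbers and values are copied from the listing; the function names are made up for illustration, and the FPU instructions and delays are not modelled at all. The one thing it adds is the arithmetic behind 65h and 62h: on the 8259A a "specific EOI" command is 60h ORed with the IRQ level.

```python
# Port numbers and values come from the listing; the names are hypothetical.
FPU_BUSY_RESET_PORT = 0xF0
PIC_SLAVE_CMD_PORT = 0xA0
PIC_MASTER_CMD_PORT = 0x20


def specific_eoi(level):
    """8259A 'specific EOI' command byte: 0x60 ORed with the IRQ level (0-7)."""
    return 0x60 | (level & 0x07)


def replay_port_writes():
    """Return the (port, value) writes of __fpIRQ13 in listing order.

    FNSTSW, FNCLEX and the WASTE_TIME delays are not modelled; they show
    up only as comments marking where they sit between the writes.
    """
    writes = []

    def out(port, value):
        writes.append((port, value))

    out(FPU_BUSY_RESET_PORT, 0x00)              # reset busy line
    out(PIC_SLAVE_CMD_PORT, specific_eoi(5))    # 65h: IRQ13 is input 5 on the slave
    out(PIC_MASTER_CMD_PORT, specific_eoi(2))   # 62h: the slave cascades into master input 2
    # fnstsw [bp+2]
    out(FPU_BUSY_RESET_PORT, 0x00)
    # first commented-out fnclex
    out(FPU_BUSY_RESET_PORT, 0x00)
    # second commented-out fnclex
    out(FPU_BUSY_RESET_PORT, 0x00)
    # fnclex ("Now this is safe.")
    return writes


if __name__ == "__main__":
    writes = replay_port_writes()
    for port, value in writes:
        print(f"out {port:#04x}, {value:#04x}")
    resets = sum(1 for port, _ in writes if port == FPU_BUSY_RESET_PORT)
    print("writes to port F0:", resets)
```

Laid out this way, each write to F0 after the first one follows a spot where an FPU instruction is (or used to be), which fits the guess above, though the model itself proves nothing about why the original authors did it.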
Somewhere there is a production codebase containing a particular sequence of check-ins that reflect the peak of my similar flailings.
I am not proud of my desperation, but I can acknowledge it now.
Those opening "wtf" sequences might be there as filler space; harmless instructions with a known pattern where you can come back later and insert different instructions. Most people use NOPs for that but perhaps they wanted a different signature or needed 3 separate, differentiated patch points at entry. Or maybe they wanted to help sell more 8087 chips.
Anybody recall if there was a notable performance difference between Borland's FP emulation lib and M$, then? My habit at the time was to religiously avoid all floats, to the point of shipping a home made arbitrary precision BCD math library. It was no faster than anything else but it gave the same results for the same inputs, every time on every machine.
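For anyone wondering what "same results for the same inputs" looks like in practice, here is a minimal sketch of digit-at-a-time decimal addition, the core of any BCD-style library. It is not the library described above (that code isn't shown); the function name is made up, and it only handles non-negative integers given as digit strings.

```python
DIGITS = "0123456789"


def bcd_add(a, b):
    """Add two non-negative decimal strings one digit at a time.

    Only integer arithmetic on single digits is used, so the result does
    not depend on any floating-point hardware or emulation library.
    """
    for s in (a, b):
        if not s or any(c not in DIGITS for c in s):
            raise ValueError("expected a non-empty string of decimal digits")

    # Pad to equal width so the digits line up from the right.
    width = max(len(a), len(b))
    a = a.rjust(width, "0")
    b = b.rjust(width, "0")

    carry = 0
    result = []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        result.append(DIGITS[total % 10])
        carry = total // 10
    if carry:
        result.append("1")

    # Digits were produced least-significant first; drop leading zeros.
    return "".join(reversed(result)).lstrip("0") or "0"


if __name__ == "__main__":
    print(bcd_add("999", "1"))
    print(bcd_add("123456789012345678901234567890", "1"))
```

Slow, as admitted above, but every step is exact, which is the whole point.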
I've inherited a similar bit of code that kicks in right after pivot (of Linux boot) and tries to disassemble and clean up whatever storage was concocted by the previous steps during boot, and then proceeds to assemble it using some user-supplied layout.
The code is awful, but, really, if anyone's to blame, it's the Linux people who never cared to systematize and unify system's understanding and representation of storage.
Got rabbit holed... I love this ad - https://www.os2museum.com/wp/os2-history/os2-beginnings/1987... - it is a sort of weird mixture of Steve Jobs's Apple smooth talking and desperate street seller at the same time.
I'm not nearly expert enough to judge, but to me it smells like heavy wizardry.
"Desperation" or random iterations until it passed every test. It doesn't seem to have a lot of opcodes. How much time did it take to find the algorithm with the processing speed of their time?
Somewhat off topic, but your network switches don't still come with metal cases? I get the cheapest stuff that's likely to be reasonably good quality and they all have metal cases.
This triggered my PTSD haha
The removal of "This..." from the title here really confuses it.
With "This", it's obvious the title is "(This code) smells of desperation". The submitted title is ambiguous; it could mean "(Code smells) of desperation".
> But the code in WIN87EM.DLL looks very much like the result of changes made in desperation until it worked somehow, even though the changes made little or no sense.
This is how the characters in Coding Machines realized something was up: assembly instructions involving carry bits that made no sense, which they later realized was how an AI writes code. https://www.teamten.com/lawrence/writings/coding-machines/
> It took us the rest of the afternoon to pick through the convoluted jump targets and decode four consecutive instructions. That snippet, it turns out, was finding the sign of an integer. Anyone else would have done a simple comparison and a jump to set the output register to -1, 0, or 1, but the four instructions were a mess of instructions that all either set the carry bit as a side-effect, or used it in an unorthodox way.
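The story doesn't show the actual instructions, so the following is only an illustration of the contrast it describes: the obvious compare-and-branch sign function next to a branch-free one built from sign bits, which is roughly the flavor of trick that carry-flag code plays. Both function names are made up, and the bit version assumes a 32-bit two's complement value.

```python
MASK32 = 0xFFFFFFFF


def sign_compare(x):
    """The version 'anyone else' would write: compare and branch."""
    if x < 0:
        return -1
    if x > 0:
        return 1
    return 0


def sign_bits(x):
    """Branch-free sign for -2**31 <= x < 2**31, using only bit operations.

    'negative' is the sign bit of x. 'nonzero' is the sign bit of
    (x | -x), which is set for every value except zero.
    """
    u = x & MASK32                       # 32-bit two's complement view of x
    negative = u >> 31
    nonzero = (u | (-u & MASK32)) >> 31
    return nonzero - 2 * negative


if __name__ == "__main__":
    for value in (-2**31, -5, 0, 7, 2**31 - 1):
        print(value, sign_compare(value), sign_bits(value))
```

The second one is correct, but you can see why decoding something like it from raw machine code would eat an afternoon.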