So I nearly pulled out all my hair today trying to figure out the most insane and bizarre problem. I thought my OS had gotten fucked up in some way, even worried that I had some kind of infection. There was however, nothing wrong, except my stupidity.
So it began when suddenly, and for no reason I could determine a piece of software I was working on suddenly got 10 times slower. It was a development tool I have produced which I make extensive use of and being 10 times slower was instantly noticed. At first I thought there was some kind of bug introduced by changing compiler flags which recently I had done. After changing them all back and recompiling everything there was no change.
At this point I was thoroughly confused, maybe the input data had changed somehow making the tool take more time to run, but even though the tool has sloppy n² algorithms 10 times slower is still a huge change when I did not recall changing anything at all. At some point I used my sampling profiler to determine where in the tool the slowdown occurred, that place was the symbol lookup function, it uses a crappy n² search array which was taking up all the time. That function performed some 40 million string comparisons during the execution of the program, that is quite allot, but it was taking 3000ms to run which is terrible.
At this point in desperation I pulled out an old binary and tried, that. The problem still persisted. I then coppied the code to another computer and ran the current version and there was no problem. I though maybe my cpu had down-clocked or broke in some way so I ran the tool in VM on same machine and it worked fine. I restarted in safe mode and there the tool ran at normal speed. At this point I was getting worried. What the fuck had happened to my computer as to make my tool run slow. Had my OS broken in some bizarre way as to cause slowdown to piece of code which called no function called. Did I have some kind of infection. I was about ready to reinstall my whole computer, which would have been a serious ordeal.
Before reinstalling my computer I decided to do a few more experiments, I wanted to narrow down the exact cause of the slowness. I was thinking that some sequence of instructions was fucking up the cpu or maybe there was some problem with memory or something. Maybe there where some silent exceptions being thrown somewhere. Anyway after a little more experimentation a thought entered my brain, page heap.
The fucking page heap, I had recently enabled the page heap when trying to solve a bug in my tool, it turned out the bug was not memory corruption but I though not to disable the page heap because it had never caused me issues before. I was so wrong to leave it enabled.
What does the page heap do?
The page heap alters the behavior of the memory allocator such that every allocation is given its own page. This is horrific in terms of memory usage but makes dealing with heap corruption much easier as now writing beyond the allocation can be detected by a page fault. Its not immediately obvious but this will also have a horrific effect on performance in some situations.
The cache is arranged in such a way that only a certain number of items can share the same lower address bits. This type of cache is called Set Associative Cache. My cpu and pretty much every modern cpu has a cache which operates in this fashion, this limitation in not too bad in the normal case. However with page heap enabled pretty much all the small allocations end up with the same lower address bits completely filling up all usable slots in the cache resulting in thrashing and the massive 10x slowdown.
So in summary, always be aware of how the page heap can fuck you up. You shall disable it as soon as it is no-longer needed. It can come back to bite you later when you have forgotten that you had enabled it. But since you have read this warning you will now know where to look if your code gets slow for no reasons, you will not have to face the terror that I had this day.