In case anyone is curious — those spikes of 75 MB of allocations were caused by parsing a single CRL, less than 800K in size, with the x509 library. I wasn't particularly fond of it before, and now even less so.
At my previous job we had a web crawler that would work correctly until it suddenly exhausted all the memory. We thought it was a space leak (we'd had a lot of them before), so I started digging, but after some time I realized that the crawler was choking on a particular HTML page it was unable to parse. The page was from a site called something like "Grandma's country house", and it simply didn't use any closing tags: lots of opening tags and not a single closing one. A browser would render the site correctly, but tagsoup couldn't handle it.
We fixed the bug by killing the parser on a timeout and added a 666.html test case.
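The timeout trick above can be sketched with just `base`: since a lazy parser's result is only produced as it is demanded, forcing the result inside `System.Timeout.timeout` lets GHC's async exceptions cut off a pathological input. This is a minimal illustration, not the crawler's actual code; `forceWithin` is a hypothetical helper name, and a real tagsoup result would be forced the same way via its list spine.

```haskell
import System.Timeout (timeout)
import Control.Exception (evaluate)

-- Walk the full spine of a lazily produced parse result, or give up
-- after the given number of microseconds. 'length' demands every
-- element, so a parser stuck on pathological input is interrupted
-- instead of eating all the memory.
forceWithin :: Int -> [a] -> IO (Maybe Int)
forceWithin micros xs = timeout micros (evaluate (length xs))

main :: IO ()
main = do
  ok <- forceWithin 100000 (replicate 5 'x')  -- finishes instantly
  print ok                                    -- Just 5
  bad <- forceWithin 100000 (repeat 'x')      -- would never finish on its own
  print bad                                   -- Nothing
```

Note that `timeout` relies on asynchronous exceptions, which GHC can only deliver at allocation points; that is fine here because walking a lazily built list allocates constantly, but a tight non-allocating loop could still ignore the timeout.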