Stack-buffer-overflow in AffixMgr::compound_check_morph · Issue #996 · hunspell/hunspell · GitHub
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack-buffer-overflow in AffixMgr::compound_check_morph #996

Open
jonathanmetzman opened this issue Feb 1, 2024 · 2 comments
Open

Stack-buffer-overflow in AffixMgr::compound_check_morph #996

jonathanmetzman opened this issue Feb 1, 2024 · 2 comments

Comments

@jonathanmetzman
Copy link

jonathanmetzman commented Feb 1, 2024

This was found using this fuzz target:

#include <fstream>
#include <hunspell/hunspell.hxx>
#include <sstream>

extern "C" {
#include <hunspell/hunspell.h>
}

extern "C" int LLVMFuzzerTestOneInput(const char *data, size_t size) {
  if (size < 1)
    return 0;
  // use first byte as len of following word to feed into the spell checking
  int wordlen = data[0];
  ++data;
  --size;
  if (wordlen > size)
    return 0;

  std::ofstream wrd("/tmp/test.word", std::ios_base::out | std::ios_base::trunc | std::ios_base::binary);
  wrd.write(data, wordlen);
  wrd.close();

  std::string word(data, wordlen);
  data += wordlen;
  size -= wordlen;

  // take the rest and split into into two, to make aff and dic
  size_t affsize = size / 2;
  std::ofstream aff("/tmp/test.aff", std::ios_base::out | std::ios_base::trunc | std::ios_base::binary);
  aff.write(data, affsize);
  aff.close();

  size_t dicsize = size - affsize;
  std::ofstream dic("/tmp/test.dic", std::ios_base::out | std::ios_base::trunc | std::ios_base::binary);
  dic.write(data + affsize, dicsize);
  dic.close();

  Hunhandle *hndl = Hunspell_create("/tmp/test.aff", "/tmp/test.dic");

  char **slst;

  int ret = Hunspell_generate(hndl, &slst, (char *)word.c_str(), (char *)"?");

  for (int i = 0; i < ret; i++) {
    free(slst[i]);
  }
  free(slst);

  Hunspell_destroy(hndl);

  return 0;
}

If we pass it this POC, we can get a crash minimized-from-65ce03ed5fefe3091a8a4c985800dec5e25af608.txt

Running: ./minimized-from-65ce03ed5fefe3091a8a4c985800dec5e25af608
=================================================================
==2467705==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7f2960901060 at pc 0x559b645f1547 bp 0x7ffd9e872070 sp 0x7ffd9e872068
READ of size 1 at 0x7f2960901060 thread T0
    #0 0x559b645f1546 in AffixMgr::compound_check_morph(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, short, short, short, short, hentry**, hentry**, char, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const*) (/home/user/hunspell/out/fuzzer+0x1dc546) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #1 0x559b645b2d42 in SuggestMgr::suggest_morph(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) (/home/user/hunspell/out/fuzzer+0x19dd42) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #2 0x559b64568566 in HunspellImpl::analyze_internal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) (/home/user/hunspell/out/fuzzer+0x153566) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #3 0x559b64566494 in HunspellImpl::analyze(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) (/home/user/hunspell/out/fuzzer+0x151494) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #4 0x559b6456bb30 in HunspellImpl::generate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) (/home/user/hunspell/out/fuzzer+0x156b30) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #5 0x559b6456f30d in HunspellImpl::generate(char***, char const*, char const*) (/home/user/hunspell/out/fuzzer+0x15a30d) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #6 0x559b64570f4c in Hunspell_generate (/home/user/hunspell/out/fuzzer+0x15bf4c) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #7 0x559b64550c74 in LLVMFuzzerTestOneInput (/home/user/hunspell/out/fuzzer+0x13bc74) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #8 0x559b6445e4c4 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/user/hunspell/out/fuzzer+0x494c4) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #9 0x559b64447413 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/home/user/hunspell/out/fuzzer+0x32413) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #10 0x559b6444d036 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/user/hunspell/out/fuzzer+0x38036) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #11 0x559b64477b26 in main (/home/user/hunspell/out/fuzzer+0x62b26) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)
    #12 0x7f29622456c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #13 0x7f2962245784 in __libc_start_main csu/../csu/libc-start.c:360:3
    #14 0x559b64441e80 in _start (/home/user/hunspell/out/fuzzer+0x2ce80) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)

Address 0x7f2960901060 is located in stack of thread T0 at offset 96 in frame
    #0 0x559b645f07ef in AffixMgr::compound_check_morph(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, short, short, short, short, hentry**, hentry**, char, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const*) (/home/user/hunspell/out/fuzzer+0x1db7ef) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762)

  This frame has 42 object(s):
    [32, 40) 'words.addr'
    [64, 96) 'st' <== Memory access at offset 96 overflows this variable
    [128, 160) 'presult'
    [192, 200) 'cmin'
    [224, 232) 'cmax'
    [256, 264) 'clock_now'
    [288, 296) 'ref.tmp'
    [320, 328) 'ref.tmp4'
    [352, 384) 'p'
    [416, 448) 'ref.tmp325'
    [480, 512) 'ref.tmp340'
    [544, 576) 'ref.tmp353'
    ...
    [1648, 1680) 'ref.tmp1753'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/home/user/hunspell/out/fuzzer+0x1dc546) (BuildId: 7c8731a8dd223174d57c567cf90fe1e6520ca762) in AffixMgr::compound_check_morph(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, short, short, short, short, hentry**, hentry**, char, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const*)
Shadow bytes around the buggy address:
  0x7f2960900d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f2960900e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f2960900e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f2960900f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f2960900f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x7f2960901000: f1 f1 f1 f1 00 f2 f2 f2 00 00 00 00[f2]f2 f2 f2
  0x7f2960901080: 00 00 00 00 f2 f2 f2 f2 00 f2 f2 f2 00 f2 f2 f2
  0x7f2960901100: 00 f2 f2 f2 f8 f2 f2 f2 f8 f2 f2 f2 f8 f8 f8 f8
  0x7f2960901180: f2 f2 f2 f2 f8 f8 f8 f8 f2 f2 f2 f2 f8 f8 f8 f8
  0x7f2960901200: f2 f2 f2 f2 f8 f8 f8 f8 f2 f2 f2 f2 f8 f2 f8 f2
  0x7f2960901280: f8 f2 f8 f2 f8 f2 f8 f2 f8 f2 f8 f8 f8 f8 f2 f2
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==2467705==ABORTING

Sorry if it doesn't make sense to use the API this way or pass it untrusted input.

@jonathanmetzman
Copy link
Author

@caolanm maybe you know if this is a real bug.

@caolanm
Copy link
Contributor

caolanm commented Feb 5, 2024

This looks similar to the fuzzer of src/tools/affdicfuzzer.cxx except instead of spellchecking and getting suggestions it uses the "generate" thing in hunspell that has the job of generating words from a given root via a given rule.

So like that other fuzzer it needs a custom .aff and .dic along with the input word. So, sure it's a crash and worth fixing, but on the face of things I'm not super concerned unless an attacker can also provide the aff and dic too, which at least for the LibreOffice, and I presume general, case isn't typically possible.

In LibreOffice IIUC we have some thesaurus integration with hunspell so if there is a request to get a synonym for e.g. "walking" we might attempt to analyze that to get "walk", get synonyms of "walk" and then attempt to generate "ing" variants of those synonyms. That only happens either with direct calls to the analyze apis or via the spellml stuff seen in man 3 hunspell. I guess if there is concern here, then one could patch out the spellml support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants