This is the CVE that I found in OpenCC.

The MITRE entry is here:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-16982

Let’s debug the binary.

(gdb) bt
#0 strlen () at ../sysdeps/x86_64/strlen.S:106
#1 0x00007ffff7b667d7 in opencc::DictEntry::KeyLength (this=<optimized out>) at /home/orlog/Documents/OpenCC/src/DictEntry.hpp:44
#2 opencc::BinaryDict::KeyMaxLength (this=<optimized out>) at /home/orlog/Documents/OpenCC/src/BinaryDict.cpp:27
#3 0x00007ffff7b8e406 in opencc::DartsDict::NewFromFile (fp=<optimized out>) at /home/orlog/Documents/OpenCC/src/DartsDict.cpp:122
#4 0x00007ffff7b7ee76 in opencc::SerializableDict::TryLoadFromFile<opencc::DartsDict> (fileName=..., dict=0x7fffffffdb90)
at /home/orlog/Documents/OpenCC/src/SerializableDict.hpp:62
Python Exception <class 'gdb.error'> There is no member named _M_dataplus.:
#5 0x00007ffff7b9f42f in opencc::SerializableDict::NewFromFile<opencc::DartsDict> (fileName=) at /home/orlog/Documents/OpenCC/src/SerializableDict.hpp:71
Python Exception <class 'gdb.error'> There is no member named _M_dataplus.:
Python Exception <class 'gdb.error'> There is no member named _M_dataplus.:
#6 0x00007ffff7b9e0a5 in LoadDictionary (format=, inputFileName=) at /home/orlog/Documents/OpenCC/src/DictConverter.cpp:29
#7 0x00007ffff7b9e367 in opencc::ConvertDictionary (inputFileName=<incomplete type>, outputFileName=<incomplete type>, formatFrom=<incomplete type>,
formatTo=<incomplete type>) at /home/orlog/Documents/OpenCC/src/DictConverter.cpp:53
#8 0x0000000000407126 in main (argc=<optimized out>, argv=<optimized out>) at /home/orlog/Documents/OpenCC/src/tools/DictConverter.cpp:47

The backtrace ends in strlen(), called from DictEntry::KeyLength, so the first guess is that the problem is in key(). Calling key() on its own works fine, but printf(key()) and strlen(key()) both end in a segmentation fault.

We know key() returns a const char *, so calling it only produces a pointer; the crash happens when that pointer is dereferenced. Something must be wrong with the pointer itself, so let's stop just before the printf we added.

(gdb) info locals 
s = 0xe428c0 <error: Cannot access memory at address 0xe428c0>
(gdb) x/128wx 0x642820
0x642820: 0x000000f3 0x00000000 0x000000f2 0x00000000
0x642830: 0x000000f5 0x00000000 0x000000f4 0x00000000
0x642840: 0x000000f7 0x00000000 0x000000f6 0x00000000
0x642850: 0x000000f9 0x00000000 0x000000f8 0x00000000
0x642860: 0x000000fb 0x00000000 0x000000fa 0x00000000
0x642870: 0x000000fd 0x00000000 0x000000fc 0x00000000
0x642880: 0x000000ff 0x00000000 0x000000fe 0x00000000
0x642890: 0x00000000 0x00000000 0x00000021 0x00000000
0x6428a0: 0xf7dd0f88 0x00007fff 0x00000001 0x00000001
0x6428b0: 0x00640dd0 0x00000000 0x00000091 0x00000000
0x6428c0: 0xe6978de5 0xe5008cb6 0xace6a7a4 0x8cb6e696   <- the start of the keys
0x6428d0: 0xa7a4e500 0x008cb6e6 0xe6b19de6 0xe6008cb6
0x6428e0: 0xb2e699b2 0x8cb6e6b3 0x99b2e600 0xe69aade9
0x6428f0: 0xe6008cb6 0xb6e6b3b2 0xb3e6008c 0x8cb6e6a5
0x642900: 0x8cb6e600 0x00beb0e5 0xe6b1b7e6 0xe6008cb6
0x642910: 0xb6e6aaba 0x91e8008c 0x8cb6e6b5 0x94a0e800
0x642920: 0x008cb6e6 0xe6bfa5e8 0xe9008cb6 0xade982b0
0x642930: 0x8cb6e69a 0xbbbae900 0x008cb6e6 0xe68ebbe9
0x642940: 0x00008cb6 0x00000000 0x00000091 0x00000000
0x642950: 0xe6978de5 0xe5008cb6 0xace6a7a4 0x8cb6e696
0x642960: 0xa7a4e500 0x008cb6e6 0xe6b19de6 0xe6008cb6
0x642970: 0xb2e699b2 0x8cb6e6b3 0x99b2e600 0xe69aade9
0x642980: 0xe6008cb6 0xb6e6b3b2 0xb3e6008c 0x8cb6e6a5
0x642990: 0x8cb6e600 0x00beb0e5 0xe6b1b7e6 0xe6008cb6
0x6429a0: 0xb6e6aaba 0x91e8008c 0x8cb6e6b5 0x94a0e800
0x6429b0: 0x008cb6e6 0xe6bfa5e8 0xe9008cb6 0xade982b0
0x6429c0: 0x8cb6e69a 0xbbbae900 0x008cb6e6 0xe68ebbe9
0x6429d0: 0x00008cb6 0x00000000 0x00000021 0x00000000
0x6429e0: 0x00000000 0x00000000 0x00000000 0x00000000
0x6429f0: 0x00000000 0x00000000 0x00000031 0x00000000
0x642a00: 0xf7dd1048 0x00007fff 0x00e428c0 0x00000000   <- 0xe428c0 is stored where the start of the keys should be
0x642a10: 0x00642a30 0x00000000 0x00642a38 0x00000000

Compared with the good sample, this pointer should hold the start of the keys (0x6428c0 in this run), but it holds 0xe428c0, an address gdb cannot even read.
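The distance between the two addresses can be worked out directly from the gdb output. A quick sketch of that arithmetic (the constant and function names are mine, only the two addresses come from the session above):

```cpp
#include <cstdint>
#include <cstdio>

// Both addresses are copied from the gdb session above.
constexpr std::uint64_t kKeysStart  = 0x6428c0;  // where the key bytes begin in the dump
constexpr std::uint64_t kBadPointer = 0xe428c0;  // value of `s` shown by `info locals`

// Distance from the real start of the keys to the pointer that crashed.
constexpr std::uint64_t address_gap() { return kBadPointer - kKeysStart; }

// Checked at compile time: the pointer is off by exactly 0x800000.
static_assert(address_gap() == 0x800000, "pointer is off by 0x800000");

// Prints the gap in hexadecimal for a quick look.
void print_gap() {
  std::printf("gap = %#llx\n", static_cast<unsigned long long>(address_gap()));
}
```

The gap is a round 0x800000, so the bad pointer looks like the start of the keys plus one large offset rather than random garbage.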

So I assume the problem is in BinaryDict::NewFromFile: some offset in the input file has probably been changed. Let's modify the source code to print all of those offsets.
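The instrumentation amounts to printing every length and offset as it is read. Here is a minimal sketch of the idea, assuming a simplified layout of host-endian 64-bit fields (numItems, keyTotalLength, key bytes, valueTotalLength, value bytes, then numValues and keyOffset for the first item). That layout is an assumption for illustration and may not match the real .ocd format or the real BinaryDict::NewFromFile; dump_offsets, append_u64 and append_cstr are hypothetical helpers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: appends one host-endian 64-bit field to a test buffer.
void append_u64(std::vector<unsigned char>& out, std::uint64_t v) {
  unsigned char raw[sizeof v];
  std::memcpy(raw, &v, sizeof v);
  out.insert(out.end(), raw, raw + sizeof v);
}

// Hypothetical helper: appends a C string including its terminating NUL.
void append_cstr(std::vector<unsigned char>& out, const char* s) {
  out.insert(out.end(), s, s + std::strlen(s) + 1);
}

// Walks the assumed layout and logs each field in the "[+]name:value" style.
// Values are printed in decimal. Stops with a "[-]" line on truncated input.
std::string dump_offsets(const std::vector<unsigned char>& file) {
  std::ostringstream log;
  std::size_t pos = 0;

  // Reads one 64-bit field and logs it; fails instead of reading past the end.
  auto read_u64 = [&](const char* name, std::uint64_t& out) -> bool {
    if (file.size() - pos < sizeof out) return false;
    std::memcpy(&out, file.data() + pos, sizeof out);
    pos += sizeof out;
    log << "[+]" << name << ":" << out << "\n";
    return true;
  };

  // Skips a byte region (the key or value bytes) with the same bounds check.
  auto skip = [&](std::uint64_t n) -> bool {
    if (file.size() - pos < n) return false;
    pos += static_cast<std::size_t>(n);
    return true;
  };

  std::uint64_t numItems = 0, keyTotalLength = 0, valueTotalLength = 0;
  std::uint64_t numValues = 0, keyOffset = 0;
  const bool ok = read_u64("numItems", numItems) &&
                  read_u64("keyTotalLength", keyTotalLength) &&
                  skip(keyTotalLength) &&
                  read_u64("valueTotalLength", valueTotalLength) &&
                  skip(valueTotalLength) &&
                  read_u64("numValues", numValues) &&
                  read_u64("keyOffset", keyOffset);
  if (!ok) log << "[-]truncated input\n";
  return log.str();
}
```

Running the same dump over a good file and the crashing file and comparing the two logs line by line points at the field that differs.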

[+]numItems:17
[+]keyTotalLength:131
[+]keybuffer:南涌
[+]valueTotalLength:131
[+]numValues:1
[+]keyOffset:800000
[+]keyBuffer:南涌
Segmentation fault
orlog@hero:~/Downloads/OpenCC/build/dbg/src/tools$ ./opencc_dict -i ../../data/HKVariantsPhrases.ocd -o temp.txt -f ocd -t text
[+]numItems:17
[+]keyTotalLength:131
[+]keybuffer:南涌
[+]valueTotalLength:131
[+]numValues:1
[+]keyOffset:0

OK, now we have found the problem: it is in the offset. The crashing file carries a keyOffset of 800000, where the good file has 0.
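An offset read from an untrusted file needs a range check before it is turned into a pointer. Below is a minimal sketch of such a check, assuming the keys live in one contiguous buffer of NUL-terminated strings. checked_key is a hypothetical name, and this is not necessarily how OpenCC fixed the bug.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Returns the key that starts at keyOffset inside keyBuffer, or nullptr when
// the offset is out of range or no NUL terminator follows inside the buffer.
const char* checked_key(const std::string& keyBuffer, std::size_t keyOffset) {
  // Reject offsets that point at or past the end of the key data.
  if (keyOffset >= keyBuffer.size()) return nullptr;

  const char* start = keyBuffer.data() + keyOffset;
  const std::size_t remaining = keyBuffer.size() - keyOffset;

  // Make sure strlen() on the result would stop inside the buffer.
  if (std::memchr(start, '\0', remaining) == nullptr) return nullptr;
  return start;
}
```

With a 131-byte key buffer like the one in the log above, an offset of 0 passes and an offset of 0x800000 is rejected before any pointer is formed.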