X86 is a pretty messy ISA to target for a compiler backend. Luckily, the X86-64 variant has been cleaned up somewhat and there are excellent encoding/decoding tables we can build on.
The instruction decoder/encoder is based on x86data.js by Petr Kobalicek, which is in the public domain.
The implementation is far more complete than what the code generator component needs and might be useful on its own.
The currently supported instructions are listed at the beginning of [opcode_tab.py] and should cover > 95% of instructions found in a typical executable.
Use objdump -d -M intel <file.exe> to get a disassembly in Intel assembler syntax.
To see the decoder in action try:
objdump -d -M intel --insn-width=12 /usr/bin/bash | ./opcode_test.py
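To make the pipeline above more concrete, here is a minimal sketch of how a script could pull the raw instruction bytes and the disassembly text out of each objdump line. It is not the code in opcode_test.py; the name `parse_objdump_line` and the regular expression are assumptions made for illustration, based only on the usual `address:<TAB>hex bytes<TAB>text` layout of objdump -d output.

```python
import re

# Matches an objdump -d instruction line such as
#   "  401000:\t48 89 e5             \tmov    rbp,rsp"
# (hypothetical helper, not taken from opcode_test.py)
_INSN_RE = re.compile(r"^\s*([0-9a-fA-F]+):\t((?:[0-9a-fA-F]{2} )+)\s*\t(.*)$")


def parse_objdump_line(line):
    """Return (address, raw_bytes, asm_text), or None for non-instruction lines."""
    m = _INSN_RE.match(line.rstrip("\n"))
    if m is None:
        # Section headers, symbol labels, blank lines and byte-only
        # continuation lines end up here.
        return None
    address = int(m.group(1), 16)
    raw = bytes(int(b, 16) for b in m.group(2).split())
    return address, raw, m.group(3).strip()
```

A decoder test could then feed `raw_bytes` to the decoder and compare its rendering with `asm_text`. Passing --insn-width=12 keeps most instructions on a single line, so byte-only continuation lines should be rare.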
Unsupported:
- segment, control, and debug registers (sreg, creg, dreg)
- registers ah, bh, ch, dh
- MMX(2), AVX instructions
- Rep prefix
- sib addressing mode where the index field is 0x4 (no index register): this is not special-cased and is handled like a regular sib addressing mode
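For readers unfamiliar with the SIB special case mentioned in the list above, the following sketch splits ModRM and SIB bytes into their bit fields and shows what an index field of 0x4 means architecturally. The function names are made up for this example and are not part of the project's API; REX.B and the mod=0/base=5 (disp32) case are deliberately ignored to keep the sketch short.

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0x3, (byte >> 3) & 0x7, byte & 0x7


def split_sib(byte):
    """Split a SIB byte into (scale_factor, index, base)."""
    return 1 << ((byte >> 6) & 0x3), (byte >> 3) & 0x7, byte & 0x7


def describe_sib(byte, rex_x=False):
    """Describe a SIB byte (illustrative helper, ignores REX.B and disp32)."""
    scale, index, base = split_sib(byte)
    if index == 0x4 and not rex_x:
        # Index 0b100 without REX.X means "no index register".
        return f"base={base}, no index"
    idx = index + (8 if rex_x else 0)
    return f"base={base}, index={idx}*{scale}"
```

For example, `mov rax, [rsp+8]` encodes as 48 8b 44 24 08: the ModRM byte 0x44 has rm=4, which announces a SIB byte, and the SIB byte 0x24 has index=4 and base=4 (rsp), i.e. no index register.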
Planned:
- Lock prefix support
- a mechanism for substituting an instruction with an equivalent but shorter one.
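As an illustration of what substituting a shorter equivalent instruction could look like, the sketch below enumerates the standard encodings of `add rax, imm` and picks the shortest one. This is only an assumption about one possible approach, not the planned mechanism; `encode_add_rax_imm` and `shortest` are hypothetical names.

```python
import struct


def encode_add_rax_imm(imm):
    """Return candidate encodings of `add rax, imm` for a signed immediate."""
    candidates = []
    if -(1 << 31) <= imm < (1 << 31):
        imm32 = struct.pack("<i", imm)
        candidates.append(b"\x48\x81\xc0" + imm32)  # REX.W 81 /0 id
        candidates.append(b"\x48\x05" + imm32)      # REX.W 05 id (rax-only form)
    if -128 <= imm < 128:
        # REX.W 83 /0 ib: sign-extended 8-bit immediate
        candidates.append(b"\x48\x83\xc0" + struct.pack("<b", imm))
    return candidates


def shortest(candidates):
    """Pick the shortest encoding; raises ValueError if there is none."""
    return min(candidates, key=len)
```

With these helpers, `shortest(encode_add_rax_imm(1))` yields the 4-byte imm8 form, while an immediate such as 1000 falls back to the 6-byte rax-only form instead of the 7-byte generic one.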
References:
- https://github.com/asmjit/asmdb/blob/master/x86data.js (basis for our de-/encoder)
- https://asmjit.com/asmgrid/ (browsable version of above)
- https://github.com/asmjit/asmjit/blob/master/src/asmjit/x86/x86instdb.cpp
- http://ref.x86asm.net/geek64-abc.html (alternative table)
- http://ref.x86asm.net/index.html (explanation)
- sandpile.org https://sandpile.org/
- https://defuse.ca/online-x86-assembler.htm#disassembly2
- https://www.felixcloutier.com/x86/
- https://www.jeetizee.com/x86_ref_book_web/instruction/ (similar to above)
- https://godbolt.org/ (check target code generated by various compilers)
- https://wiki.osdev.org/X86-64_Instruction_Encoding
- http://brokenthorn.com/Resources/OSDevX86.html
- https://www.cs.cmu.edu/~410/doc/intel-isr.pdf
- https://pyokagan.name/blog/2019-09-20-x86encoding/
- zydis https://github.com/zyantific/zydis
- xed https://github.com/intelxed/xed
- PeachPy https://github.com/Maratyszcza/PeachPy
- https://reverseengineering.stackexchange.com/questions/12379/xclist-of-x86-x64-instructions-that-implicitly-access-registers
- https://www.akkadia.org/drepper/x86-opcode-structure.pdf
- http://www.egr.unlv.edu/~ed/assembly64.pdf