author | Malfurious <m@lfurio.us> | 2023-02-22 15:01:23 -0500 |
---|---|---|
committer | dusoleil <howcansocksbereal@gmail.com> | 2023-02-24 03:39:09 -0500 |
commit | d49577bdb3cd352fcbdab26391711ccbfcca82ec | |
tree | 614fb293c5ed6d041515c98b67766697dd285eb7 | |
parent | a4543a9ffcc52f205a8c2aaa56909acda4e9d0b1 | |
symtbl: Refactor module as an improved container type (and more)
This effort was triggered by three immediate wants of the module:
- An improved data container interface to support things like key
  iteration and better key management. This is primarily wanted by
  the ROP module (which is still in development).
- The introduction of package documentation across the project. This
  module is now fully documented.
- A fix for a bug in the Symtbl constructor, which would not allow a
  caller to supply "self" as an initial symbol name, even though it is
  legal in every other context. This problem was caused by the
  constructor's bound instance parameter sharing this name.
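
The constructor bug can be reproduced without Symtbl at all. The sketch
below uses hypothetical stand-in names (OldTable, NewTable and new_table
are not part of this patch) to show why a keyword called "self" collides
with a method's bound instance parameter, and why a plain factory
function sidesteps the collision:

```python
class OldTable:
    """Hypothetical stand-in: __init__ collects symbols via **kwargs."""

    def __init__(self, *, base=0, **kwargs):
        self.base = base
        self.symbols = dict(kwargs)


class NewTable:
    """Hypothetical stand-in: __init__ only takes instance data."""

    def __init__(self, symbols, base):
        self.symbols = symbols
        self.base = base


def new_table(*, base=0, **symbols):
    # A plain function has no bound instance parameter, so "self" is
    # just another keyword that lands in **symbols.
    return NewTable(dict(symbols), base)


try:
    OldTable(self=1)          # clashes with __init__'s first parameter
    old_error = None
except TypeError as exc:
    old_error = exc

tbl = new_table(self=1, base=0x1000)
print(type(old_error).__name__)   # TypeError
print(tbl.symbols)                # {'self': 1}
```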
This patch addresses all of these concerns, and also introduces some
fringe / QoL improvements that were discovered during the API refactor.
Element access may now be done via subscripting, as well as the previous
(and still generally preferred) .attribute notation. The syntax for
storing subtables within a parent Symtbl is now greatly streamlined due
to some implementation-level changes to the class. You may now directly
assign just a Symtbl object or a normal int, and you don't have to fuss
with tuples anymore. The subtable's base is taken as its offset in the
parent, and the new operator replacement for the .map() method may be
used to define a desired value for the parent.
This detail is actually a breaking change compared to the previous
version. While not technically a bug, it is unintuitive that the
previous version would not remove subtables when their offset was
changed by a simple assignment - the table would just move. This patch
makes it such that any symbol assignment to a regular int will replace an
old mounted subtable if one exists.
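
As a rough illustration of the assignment rules above, here is a
condensed sketch. Tbl is a hypothetical stand-in that keeps only the
subscript and remap logic of the patched class (no adjust term, no
attribute access); it is not the real module:

```python
class Tbl:
    """Hypothetical, condensed stand-in for the patched Symtbl type."""

    def __init__(self, entries=None, base=0):
        self.entries = {} if entries is None else entries
        self.base = base

    def __matmul__(self, base):
        # Remap: same shared entries, new absolute base.
        return Tbl(self.entries, base)

    def __add__(self, offset):
        return self @ (self.base + offset)

    def __sub__(self, offset):
        return self @ (self.base - offset)

    def __getitem__(self, symbol):
        # "+" handles both ints and nested tables (via __add__).
        return self.entries[symbol] + self.base

    def __setitem__(self, symbol, value):
        # "-" handles both ints and nested tables (via __sub__).
        self.entries[symbol] = value - self.base


child = Tbl()
child["a"] = 1
child["b"] = 2

parent = Tbl()
parent["nested"] = child @ 70      # child's base becomes its offset
nested_a = parent["nested"]["a"]
print(nested_a)                    # 71

parent["nested"] = 80              # a plain int replaces the subtable
print(parent["nested"])            # 80
```

No tuple is needed to mount the subtable, and the second assignment
leaves an ordinary integer symbol behind rather than a moved table.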
There are now no normal instance methods on the Symtbl type (only dunder
method overrides). This is to free up the available symbol namespace as
much as possible. The previous methods map(), adjust(), and rebase()
are now implemented as operators which, in every case, yield a new
derivative object, rather than mutating the original. All operators are
listed here:
    @     remap to absolute address
    +     remap to relative address
    -     remap to negated relative address
    >>    adjust all symbol offsets upward
    <<    adjust all symbol offsets downward
    %     rebase all symbol offsets around an absolute zero point
Additionally, Symtbl objects will convert to an integer via int(),
hex(), oct(), or bin(), yielding the base value.
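
To make the operator list concrete, the sketch below exercises each one
on a small table. Tbl is a hypothetical, read-only stand-in modeled on
the operator methods in this patch, and the addresses are made up:

```python
class Tbl:
    """Hypothetical stand-in: shared entries, local adjust, local base."""

    def __init__(self, entries, adjust, base):
        self.entries, self.adjust, self.base = entries, adjust, base

    def __index__(self):
        return self.base

    def __matmul__(self, base):
        return Tbl(self.entries, self.adjust, int(base))

    def __add__(self, offset):
        return self @ (self.base + offset)

    def __sub__(self, offset):
        return self @ (self.base - offset)

    def __rshift__(self, offset):
        return Tbl(self.entries, self.adjust + int(offset), self.base)

    def __lshift__(self, offset):
        return self >> (-offset)

    def __mod__(self, offset):
        return self >> (self.base - offset)

    def __getitem__(self, symbol):
        return self.entries[symbol] + self.base + self.adjust


t = Tbl({"a": 0x10, "b": 0x20}, 0, 0x400000)

print(hex((t @ 0x1000)["a"]))   # 0x1010    absolute remap
print(hex((t + 0x100)["a"]))    # 0x400110  relative remap
print(hex((t - 0x100)["a"]))    # 0x3fff10  negated relative remap
print(hex((t >> 8)["a"]))       # 0x400018  offsets adjusted upward
print(hex((t << 8)["a"]))       # 0x400008  offsets adjusted downward

rebased = t % t["b"]            # "b" now coincides with the base
print(hex(rebased["b"]))        # 0x400000
print(int(t), hex(t))           # 4194304 0x400000
print(hex(t["a"]))              # 0x400010  source table is unchanged
```

The integer conversions all go through __index__; int() falling back to
__index__ requires Python 3.8 or newer.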
The addition of these operators presents another breaking change to the
previous version. Previously, symbol adjustments or rebases affected
the tracked offsets and caused symbols to shift around in linked tables
as well. Since these operators now preserve the state of their source
object, this is no longer the case. The amount of shift due to
adjustment or rebasing is localized in a specific Symtbl instance (and
is affected by the use of the related operators); however, this value is
inherited by derivatives of that object.
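
A small sketch of that distinction, assuming the same layout as the
patched class (a shared entries dict plus a per-instance adjust and
base). Tbl is again a hypothetical stand-in, not the real type:

```python
class Tbl:
    """Hypothetical stand-in: entries are shared, adjust/base are local."""

    def __init__(self, entries, adjust, base):
        self.entries, self.adjust, self.base = entries, adjust, base

    def __matmul__(self, base):
        # The derivative inherits this instance's adjust value.
        return Tbl(self.entries, self.adjust, base)

    def __rshift__(self, offset):
        return Tbl(self.entries, self.adjust + offset, self.base)

    def __getitem__(self, symbol):
        return self.entries[symbol] + self.base + self.adjust

    def __setitem__(self, symbol, value):
        self.entries[symbol] = value - (self.base + self.adjust)


t = Tbl({"a": 0x10}, 0, 0)
shifted = t >> 4                 # new object; t keeps adjust == 0
moved = shifted @ 0x1000         # inherits adjust == 4 from shifted

print(hex(t["a"]))               # 0x10
print(hex(shifted["a"]))         # 0x14
print(hex(moved["a"]))           # 0x1014

shifted["b"] = 0x24              # stored in the shared entries dict
print(hex(t["b"]))               # 0x20  (visible in t, without the shift)
```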
There is a third breaking change caused by the use of operators as well.
Previously, the map() function allowed the caller to specify that the
given absolute address is not that of the table base, but of some offset
in the table, from which the new base is calculated. However, the
remapping operators take only a single numeric value as their right hand
side operand, which is the absolute or relative address. The new
intended way of accomplishing this (which is _nearly_ equivalent) is
through the combined use of the rebase and remap operations:
# The address of the puts() function in a libc tbl is leaked
sym = sym % sym.puts @ leak
aka: adjust offsets such that the known point is at the base, then move
that base to the known location. The way in which this is different to
what you would end up with before is that previously, following a
map(abs, off) the base of the table would be accurately valued
according to the known information. Now, the 'base' is considered to be
the leaked value, but internal offsets are shifted such that they still
resolve correctly.
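
A worked version of the rebase-then-remap idiom, with numbers. Tbl is a
hypothetical stand-in carrying only the methods the idiom needs, and the
libc offsets and the leaked address are invented for illustration:

```python
class Tbl:
    """Hypothetical stand-in with just rebase, remap and lookup."""

    def __init__(self, entries, adjust, base):
        self.entries, self.adjust, self.base = entries, adjust, base

    def __index__(self):
        return self.base

    def __matmul__(self, base):
        return Tbl(self.entries, self.adjust, base)

    def __mod__(self, offset):
        return Tbl(self.entries, self.adjust + self.base - offset, self.base)

    def __getitem__(self, symbol):
        return self.entries[symbol] + self.base + self.adjust


# Invented offsets and leak, for illustration only.
libc = Tbl({"puts": 0x80e50, "system": 0x50d70}, 0, 0)
leak = 0x7ffff7e80e50            # pretend this is the leaked puts()

# % and @ share precedence and group left to right:
# (libc % libc["puts"]) @ leak
sym = libc % libc["puts"] @ leak

print(hex(sym["puts"]))          # 0x7ffff7e80e50
print(hex(sym["system"]))        # 0x7ffff7e50d70
print(hex(sym))                  # 0x7ffff7e80e50  base == leak

# What the old map(abs, off) would have reported as the base:
old_base = leak - libc["puts"]
print(hex(old_base))             # 0x7ffff7e00000
```

Every symbol resolves to the same address either way; only the value
reported as the table base differs.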
Finally, a few new pieces of functionality are added to build out the
container API:
- symbol key deletion
- iteration over symbol:offset pairs
- can now check for symbol existence with the "in" keyword
- len(symtbl) returns the number of symbols defined
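
The container additions map onto standard Python protocol methods. A
minimal sketch, using a hypothetical Tbl stand-in that mirrors the
corresponding dunder methods in this patch:

```python
class Tbl:
    """Hypothetical stand-in implementing only the container protocol."""

    def __init__(self, entries, base=0):
        self.entries, self.base = entries, base

    def __getitem__(self, symbol):
        return self.entries[symbol] + self.base

    def __delitem__(self, symbol):
        del self.entries[symbol]

    def __iter__(self):
        # Yields (symbol, mapped value) pairs, like dict.items().
        return iter({k: self[k] for k in self.entries}.items())

    def __contains__(self, symbol):
        return symbol in self.entries

    def __len__(self):
        return len(self.entries)


t = Tbl({"a": 1, "b": 2, "c": 3}, base=100)

print(len(t))          # 3
print("b" in t)        # True
del t["b"]
print("b" in t)        # False
print(list(t))         # [('a', 101), ('c', 103)]
```

Because iteration yields pairs rather than bare keys, dict(t) produces a
snapshot of the mapped values.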
Signed-off-by: Malfurious <m@lfurio.us>
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
-rw-r--r-- | sploit/payload.py | 6 |
-rw-r--r-- | sploit/symtbl.py | 233 |
2 files changed, 185 insertions, 54 deletions
diff --git a/sploit/payload.py b/sploit/payload.py
index 1110e76..1775ceb 100644
--- a/sploit/payload.py
+++ b/sploit/payload.py
@@ -26,13 +26,13 @@ class Payload:
         return f'{kind}_{ctr}'
 
     def _append(self, value, sym):
-        setattr(self.sym.map(0), sym, len(self))
+        (self.sym @ 0)[sym] = len(self)
         self.payload += value
         return self
 
     def _prepend(self, value, sym):
-        self.sym.adjust(len(value))
-        setattr(self.sym.map(0), sym, 0)
+        self.sym >>= len(value)
+        (self.sym @ 0)[sym] = 0
         self.payload = value + self.payload
         return self
 
diff --git a/sploit/symtbl.py b/sploit/symtbl.py
index 3a3e697..05021f7 100644
--- a/sploit/symtbl.py
+++ b/sploit/symtbl.py
@@ -1,53 +1,184 @@
-import types
-
-class Symtbl:
-    def __init__(self, *, base=0, **kwargs):
-        object.__setattr__(self, '_namesp', types.SimpleNamespace(base=base,sym={},sub={}))
-        for k, v in {**kwargs}.items():
-            setattr(self, k, v)
-
-    def __getattr__(self, ident):
-        self = self._namesp
-        if ident == 'base': return self.base
-        off = self.base + self.sym[ident]
-        if ident in self.sub: return self.sub[ident].map(off)
-        return off
-
-    def __setattr__(self, ident, value):
-        if ident in dir(self): raise Exception(f'Symtbl: assignment would shadow non-symbol "{ident}"')
-        self = self._namesp
-        if ident == 'base':
-            self.base = value
+"""
+Symtbl data structure
+
+A Symtbl (symbol table) is an associative data container intended to model
+arbitrary memory layouts, such as structure definitions or memory-mapped
+objects. Elements may be accessed via subscript or attribute notation.
+
+A Symtbl is essentially a dictionary, in which each key (symbol name string)
+is associated with an offset value. A special key "base" represents the
+base or starting address of the overall table in memory. Whenever offset
+values are accessed, they are adjusted relative to the table's base value.
+This enables the primary function of Symtbl objects: the ability to resolve
+mapped, or absolute, addresses of objects in memory.
+
+Therefore, even though a Symtbl internally tracks symbol offsets, the apparent
+value of any symbol will always be its offset plus the table's base address.
+The table's base address will also be subtracted from values being stored in
+the table, as the provided value is assumed to be mapped in the same manner as
+the table itself.
+
+    s = Symtbl()
+    s.a = 10
+    s.b = 20
+    print(s.a, s.b) # "10 20"
+    s.base = 100
+    print(s.a, s.b) # "110 120"
+    s.c = 150
+    s.base = 10
+    print(s.a, s.b, s.c) # "20 30 60"
+
+A Symtbl's base value may be changed at any time, and this will affect the
+interpretation of offsets as described above. However, one may also create a
+remapped version of a Symtbl (without modifying the original) using the '@'
+operator. This new object will have the base value given on the right hand
+side of the '@' and its collection of symbols is referentially linked to the
+source object, meaning changes to symbol entries will be visible in both
+objects.
+
+    s1 = Symtbl()
+    s1.a = 10
+    s2 = s1 @ 1000
+    print(s1.a, s2.a) # "10 1010"
+    s2.b = 1234
+    print(s1.b, s2.b) # "234 1234"
+
+Symtbl's are also nestable, to support modeling composite memory layouts. If
+a symbol's value is assigned to another Symtbl object, rather than an integer
+offset, the child object's base value serves as its offset in the parent
+Symtbl. Symbols on the child object may then be accessed recursively from the
+parent's scope. If the parent has a non-zero base, it adjusts the offsets
+interpreted in the child.
+
+    child = Symtbl()
+    child.a = 1
+    child.b = 2
+    parent = Symtbl()
+    parent.nested = child @ 70
+    print(parent.nested.a, parent.nested.b) # "71 72"
+
+A Symtbl will allow you to uniformly adjust all offsets contained, while leaving
+the base value the same, using the '<<' and '>>' operators. A custom
+"rebase" operation is also available via the "%" operator. A rebase applies
+a uniform shift, such that the right hand side offset operand ends up coinciding
+with the Symtbl base address.
+
+    s = Symtbl()
+    s.a = 1
+    s.b = 2
+    s.c = 3
+    s.d = 4
+    s.base = 1000
+    s %= s.c # rebase at symbol 'c'
+    print(s.a, s.b, s.c, s.d) # "998 999 1000 1001"
+"""
+
+def Symtbl(*, base=0, **symbols):
+    """
+    Create a new Symtbl object
+
+    Return an empty Symtbl or, optionally, one initialized with the given
+    symbol values. Arguments _must_ be keyword arguments.
+
+    Users should call this function instead of attempting to construct the
+    Symtbl class. Construction is implemented via a normal function to prevent
+    any argument name from conflicting with __init__'s bound instance parameter.
+    """
+    self = SymtblImpl({}, 0, base)
+    for k, v in symbols.items():
+        self[k] = v
+    return self
+
+class SymtblImpl:
+    """Symtbl implementation class"""
+
+    def __init__(self, entries, adjust, base):
+        """Construct Symtbl from instance data"""
+        object.__setattr__(self, "__entries__", entries)
+        object.__setattr__(self, "__adjust__", adjust)
+        object.__setattr__(self, "base", base)
+
+    def __index__(self):
+        """Convert object to integer using base value"""
+        return self.base
+
+    def __matmul__(self, base):
+        """Create remapped version of object at absolute base"""
+        return SymtblImpl(self.__entries__, self.__adjust__, int(base))
+
+    def __add__(self, offset):
+        """Create remapped version of object at relative base"""
+        return self @ (self.base + offset)
+
+    def __sub__(self, offset):
+        """Create remapped version of object at relative base"""
+        return self @ (self.base - offset)
+
+    def __rshift__(self, offset):
+        """Create symbol adjusted version of object"""
+        return SymtblImpl(self.__entries__, self.__adjust__ + int(offset), self.base)
+
+    def __lshift__(self, offset):
+        """Create symbol adjusted version of object"""
+        return self >> (-offset)
+
+    def __mod__(self, offset):
+        """Create symbol rebased version of object"""
+        return self >> (self.base - offset)
+
+    def __getattr__(self, symbol):
+        """Return symbol offset or subtable via pseudo-attribute"""
+        return self[symbol]
+
+    def __setattr__(self, symbol, value):
+        """Set symbol offset or subtable via pseudo-attribute"""
+        self[symbol] = value
+
+    def __delattr__(self, symbol):
+        """Unset symbol via pseudo-attribute"""
+        del self[symbol]
+
+    def __len__(self):
+        """Return number of defined symbols"""
+        return len(self.__entries__)
+
+    def __getitem__(self, symbol):
+        """Return symbol offset or subtable via subscript"""
+        if symbol == "base":
+            return self.base
+        return self.__entries__[symbol] + (self.base + self.__adjust__)
+
+    def __setitem__(self, symbol, value):
+        """Set symbol offset or subtable via subscript"""
+        if symbol == "base":
+            object.__setattr__(self, "base", int(value))
+        elif symbol in dir(self):
+            raise KeyError(f"Symtbl: name '{symbol}' is reserved")
         else:
-            if type(value) is tuple: self.sub[ident], off = value
-            else: off = value
-            self.sym[ident] = off - self.base
-
-    def map(self, addr, off=0):
-        self = self._namesp
-        mm = Symtbl()
-        mm._namesp.sym, mm._namesp.sub = self.sym, self.sub
-        mm._namesp.base = addr - off
-        return mm
-
-    def adjust(self, off):
-        self = self._namesp
-        for k, v in self.sym.items():
-            self.sym[k] = v + off
-
-    def rebase(self, off):
-        self.adjust(self.base - off)
-
-    def __str__(_self):
-        FMT = '\n{:<20} {:<20}'
-        self = _self._namesp
-
-        s = f'{len(self.sym)} symbols @ {hex(_self.base)}'
-        s += FMT.format('ADDRESS', 'SYMBOL')
-        for sym, _ in sorted(self.sym.items(), key=lambda x:x[1]):
-            addr = getattr(_self, sym)
-            if type(addr) is Symtbl:
-                s += FMT.format(hex(addr.base), f'[{sym}]')
-            else:
-                s += FMT.format(hex(addr), sym)
+            self.__entries__[symbol] = value - (self.base + self.__adjust__)
+
+    def __delitem__(self, symbol):
+        """Unset symbol via subscript"""
+        del self.__entries__[symbol]
+
+    def __iter__(self):
+        """Iterate over table entries as key:value tuples, like dict.items()"""
+        return iter({ k: self[k] for k in self.__entries__ }.items())
+
+    def __contains__(self, symbol):
+        """Test symbol name membership in table"""
+        return symbol in self.__entries__
+
+    def __repr__(self):
+        """Return string representation of Symtbl"""
+        return str(self)
+
+    def __str__(self):
+        """Return string representation of Symtbl"""
+        FMT = "\n{:<20} {:<20}"
+        s = f"{len(self)} symbols @ {hex(self)}"
+        s += FMT.format("ADDRESS", "SYMBOL")
+        for symbol, offset in sorted(self, key=lambda v: int(v[1])):
+            disp = f"[{symbol}]" if type(offset) is SymtblImpl else symbol
+            s += FMT.format(hex(offset), disp)
         return s