author Malfurious <m@lfurio.us> 2023-02-22 15:01:23 -0500
committer dusoleil <howcansocksbereal@gmail.com> 2023-02-24 03:39:09 -0500
commit d49577bdb3cd352fcbdab26391711ccbfcca82ec
tree 614fb293c5ed6d041515c98b67766697dd285eb7
parent a4543a9ffcc52f205a8c2aaa56909acda4e9d0b1
symtbl: Refactor module as an improved container type (and more)

This effort was triggered by three immediate wants of the module:

- An improved data container interface to support things like key
  iteration and better key management. This is primarily wanted by the
  ROP module (which is still in development).

- The introduction of package documentation across the project. This
  module is now fully documented.

- To fix a bug in the Symtbl constructor, which would not allow a caller
  to supply "self" as an initial symbol name, even though it is legal in
  every other context. This problem was caused by the constructor's
  bound instance parameter sharing this name.

This patch addresses all of these concerns, and also introduces some
fringe / QoL improvements that were discovered during the API refactor:

- Element access may now be done via subscripting, as well as the
  previous (and still generally preferred) .attribute notation.

- The syntax for storing subtables within a parent Symtbl is now greatly
  streamlined due to some implementation-level changes to the class. You
  may now directly assign just a Symtbl object or a normal int, and you
  don't have to fuss with tuples anymore. The subtable's base is taken
  as its offset in the parent, and the new '@' operator (the replacement
  for the .map() method) may be used to define a desired value for the
  parent.

  This detail is actually a breaking change compared to the previous
  version. While not technically a bug, it is unintuitive that the
  previous version would not remove subtables when their offset was
  changed by a simple assignment - the table would just move. This patch
  makes it such that any symbol assignment to a regular int will replace
  an old mounted subtable if one exists.

- There are now no normal instance methods on the Symtbl type (only
  dunder method overrides). This is to free up the available symbol
  namespace as much as possible. The previous methods map(), adjust(),
  and rebase() are now implemented as operators which, in every case,
  yield a new derivative object, rather than mutating the original.

All operators are listed here:

    @     remap to absolute address
    +     remap to relative address
    -     remap to negated relative address
    >>    adjust all symbol offsets upward
    <<    adjust all symbol offsets downward
    %     rebase all symbol offsets around an absolute zero point

Additionally, Symtbl objects will convert to an integer via int(),
hex(), oct(), or bin(), yielding the base value.

The addition of these operators presents another breaking change to the
previous version. Previously, symbol adjustments or rebases affected the
tracked offsets and caused symbols to shift around in linked tables as
well. Since these operators now preserve the state of their source
object, this is no longer the case. The amount of shift due to
adjustment or rebasing is localized in a specific Symtbl instance (and
is affected by the use of the related operators); however, this value is
inherited by derivatives of that object.

There is a third breaking change caused by the use of operators as well.
Previously, the map() function allowed the caller to specify that the
given absolute address is not that of the table base, but of some offset
in the table, from which the new base is calculated. However, the
remapping operators take only a single numeric value as their right hand
side operand, which is the absolute or relative address.

The new intended way of accomplishing this (which is _nearly_
equivalent) is through the combined use of the rebase and remap
operations:

    # The address of the puts() function in a libc tbl is leaked
    sym = sym % sym.puts @ leak

aka: adjust offsets such that the known point is at the base, then move
that base to the known location.

The way in which this differs from what you would end up with before is
that previously, following a map(abs, off), the base of the table would
be accurately valued according to the known information. Now, the 'base'
is considered to be the leaked value, but internal offsets are shifted
such that they still resolve correctly.
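
The rebase-then-remap idiom described above can be tried in isolation. The following is a minimal sketch, not the module itself: the class is a trimmed copy of the SymtblImpl added by this patch (only the methods the demo needs), and the libc offsets and the leaked address are made-up values chosen for illustration.

```python
class SymtblImpl:
    """Trimmed copy of the class from this patch (demo subset only)."""

    def __init__(self, entries, adjust, base):
        object.__setattr__(self, "__entries__", entries)
        object.__setattr__(self, "__adjust__", adjust)
        object.__setattr__(self, "base", base)

    def __matmul__(self, base):
        # '@': same entries and adjustment, new absolute base
        return SymtblImpl(self.__entries__, self.__adjust__, int(base))

    def __rshift__(self, offset):
        # '>>': same entries and base, larger adjustment
        return SymtblImpl(self.__entries__, self.__adjust__ + int(offset), self.base)

    def __mod__(self, offset):
        # '%': shift symbols so that `offset` coincides with the base
        return self >> (self.base - offset)

    def __getattr__(self, symbol):
        return self.__entries__[symbol] + (self.base + self.__adjust__)

    def __setattr__(self, symbol, value):
        self.__entries__[symbol] = value - (self.base + self.__adjust__)


# Hypothetical libc offsets and leak, for illustration only.
libc = SymtblImpl({}, 0, 0)
libc.puts = 0x80e50
libc.system = 0x50d70
leak = 0x7f1234a80e50          # pretend this is the leaked address of puts()

# '%' and '@' share precedence and associate left: (libc % libc.puts) @ leak
mapped = libc % libc.puts @ leak

print(hex(mapped.puts))        # 0x7f1234a80e50 (equal to the leak)
print(hex(mapped.system))      # 0x7f1234a50d70
print(hex(mapped.base))        # 0x7f1234a80e50 (the leak, not the true libc base)
```

Note that `libc` itself is untouched: the shift lives in the derived object's adjustment value, which matches the "nearly equivalent" caveat in the message.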

Finally, a few new pieces of functionality are added to build out the
container API:

- symbol key deletion
- iteration over symbol:offset pairs
- can now check for symbol existence with the "in" keyword
- len(symtbl) returns the number of symbols defined

Signed-off-by: Malfurious <m@lfurio.us>
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
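
As a rough illustration of the container additions listed in the message, here is a small sketch using a trimmed copy of the container-related methods from this patch. The symbol names (`buf`, `canary`, `ret`) and the stack address are hypothetical.

```python
class SymtblImpl:
    """Trimmed copy of the class from this patch (container subset only)."""

    def __init__(self, entries, adjust, base):
        object.__setattr__(self, "__entries__", entries)
        object.__setattr__(self, "__adjust__", adjust)
        object.__setattr__(self, "base", base)

    def __matmul__(self, base):
        return SymtblImpl(self.__entries__, self.__adjust__, int(base))

    def __len__(self):
        return len(self.__entries__)

    def __getitem__(self, symbol):
        return self.__entries__[symbol] + (self.base + self.__adjust__)

    def __setitem__(self, symbol, value):
        self.__entries__[symbol] = value - (self.base + self.__adjust__)

    def __delitem__(self, symbol):
        del self.__entries__[symbol]

    def __iter__(self):
        # Yields (name, resolved value) tuples, like dict.items()
        return iter({k: self[k] for k in self.__entries__}.items())

    def __contains__(self, symbol):
        return symbol in self.__entries__


# Hypothetical stack frame layout, for illustration only.
frame = SymtblImpl({}, 0, 0)
frame["buf"] = 0x00
frame["canary"] = 0x48
frame["ret"] = 0x58

stack = frame @ 0x7ffc0000                      # remapped view, shared entries
pairs = {name: addr for name, addr in stack}    # iteration yields resolved pairs
count_before = len(stack)

if "canary" in stack:
    del stack["canary"]                         # also removes it from `frame`

for name, addr in frame:
    print(f"{name:<8} {hex(addr)}")
```

Because `frame` and `stack` share one entries dictionary, the deletion through `stack` is visible in `frame` as well.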
-rw-r--r-- sploit/payload.py 6
-rw-r--r-- sploit/symtbl.py 233
2 files changed, 185 insertions, 54 deletions
diff --git a/sploit/payload.py b/sploit/payload.py
index 1110e76..1775ceb 100644
--- a/sploit/payload.py
+++ b/sploit/payload.py
@@ -26,13 +26,13 @@ class Payload:
         return f'{kind}_{ctr}'

     def _append(self, value, sym):
-        setattr(self.sym.map(0), sym, len(self))
+        (self.sym @ 0)[sym] = len(self)
         self.payload += value
         return self

     def _prepend(self, value, sym):
-        self.sym.adjust(len(value))
-        setattr(self.sym.map(0), sym, 0)
+        self.sym >>= len(value)
+        (self.sym @ 0)[sym] = 0
         self.payload = value + self.payload
         return self
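
Reviewer's sketch of what the new `_prepend` lines do. This is a standalone approximation, assuming the Payload symbol table starts with base 0. The class is a trimmed copy of SymtblImpl from this patch, and the symbol names and byte strings are invented for the demo.

```python
class SymtblImpl:
    """Trimmed copy of the class from this patch (demo subset only)."""

    def __init__(self, entries, adjust, base):
        object.__setattr__(self, "__entries__", entries)
        object.__setattr__(self, "__adjust__", adjust)
        object.__setattr__(self, "base", base)

    def __matmul__(self, base):
        return SymtblImpl(self.__entries__, self.__adjust__, int(base))

    def __rshift__(self, offset):
        return SymtblImpl(self.__entries__, self.__adjust__ + int(offset), self.base)

    def __getitem__(self, symbol):
        return self.__entries__[symbol] + (self.base + self.__adjust__)

    def __setitem__(self, symbol, value):
        self.__entries__[symbol] = value - (self.base + self.__adjust__)


sym = SymtblImpl({}, 0, 0)
payload = b""

# Like _append: record the symbol at the current end of the payload.
(sym @ 0)["first"] = len(payload)
payload += b"AAAA"

# Like _prepend: shift existing symbols up, then place the new one at 0.
before = sym                 # keep a handle on the pre-shift object
value = b"BB"
sym >>= len(value)           # no __irshift__ exists, so `sym` is rebound to a new object
(sym @ 0)["second"] = 0
payload = value + payload

print(sym["first"], sym["second"])   # 2 0
print(before["first"])               # 0 (the source object keeps its own view)
```

The `@ 0` view keeps the stored value relative to the start of the payload even if the table is later mapped elsewhere, while the adjustment carried by `sym` makes the older symbol resolve past the prepended bytes.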
diff --git a/sploit/symtbl.py b/sploit/symtbl.py
index 3a3e697..05021f7 100644
--- a/sploit/symtbl.py
+++ b/sploit/symtbl.py
@@ -1,53 +1,184 @@
-import types
-
-class Symtbl:
-    def __init__(self, *, base=0, **kwargs):
-        object.__setattr__(self, '_namesp', types.SimpleNamespace(base=base,sym={},sub={}))
-        for k, v in {**kwargs}.items():
-            setattr(self, k, v)
-
-    def __getattr__(self, ident):
-        self = self._namesp
-        if ident == 'base': return self.base
-        off = self.base + self.sym[ident]
-        if ident in self.sub: return self.sub[ident].map(off)
-        return off
-
-    def __setattr__(self, ident, value):
-        if ident in dir(self): raise Exception(f'Symtbl: assignment would shadow non-symbol "{ident}"')
-        self = self._namesp
-        if ident == 'base':
-            self.base = value
+"""
+Symtbl data structure
+
+A Symtbl (symbol table) is an associative data container intended to model
+arbitrary memory layouts, such as structure definitions or memory-mapped
+objects. Elements may be accessed via subscript or attribute notation.
+
+A Symtbl is essentially a dictionary, in which each key (symbol name string)
+is associated with an offset value. A special key "base" represents the
+base or starting address of the overall table in memory. Whenever offset
+values are accessed, they are adjusted relative to the table's base value.
+This enables the primary function of Symtbl objects: the ability to resolve
+mapped, or absolute, addresses of objects in memory.
+
+Therefore, even though a Symtbl internally tracks symbol offsets, the apparent
+value of any symbol will always be its offset plus the table's base address.
+The table's base address will also be subtracted from values being stored in
+the table, as the provided value is assumed to be mapped in the same manner as
+the table itself.
+
+    s = Symtbl()
+    s.a = 10
+    s.b = 20
+    print(s.a, s.b) # "10 20"
+    s.base = 100
+    print(s.a, s.b) # "110 120"
+    s.c = 150
+    s.base = 10
+    print(s.a, s.b, s.c) # "20 30 60"
+
+A Symtbl's base value may be changed at any time, and this will affect the
+interpretation of offsets as described above. However, one may also create a
+remapped version of a Symtbl (without modifying the original) using the '@'
+operator. This new object will have the base value given on the right hand
+side of the '@' and its collection of symbols is referentially linked to the
+source object, meaning changes to symbol entries will be visible in both
+objects.
+
+    s1 = Symtbl()
+    s1.a = 10
+    s2 = s1 @ 1000
+    print(s1.a, s2.a) # "10 1010"
+    s2.b = 1234
+    print(s1.b, s2.b) # "234 1234"
+
+Symtbl's are also nestable, to support modeling composite memory layouts. If
+a symbol's value is assigned to another Symtbl object, rather than an integer
+offset, the child object's base value serves as its offset in the parent
+Symtbl. Symbols on the child object may then be accessed recursively from the
+parent's scope. If the parent has a non-zero base, it adjusts the offsets
+interpreted in the child.
+
+    child = Symtbl()
+    child.a = 1
+    child.b = 2
+    parent = Symtbl()
+    parent.nested = child @ 70
+    print(parent.nested.a, parent.nested.b) # "71 72"
+
+A Symtbl will allow you to uniformly adjust all offsets contained, while leaving
+the base value the same, using the '<<' and '>>' operators. A custom
+"rebase" operation is also available via the "%" operator. A rebase applies
+a uniform shift, such that the right hand side offset operand ends up coinciding
+with the Symtbl base address.
+
+    s = Symtbl()
+    s.a = 1
+    s.b = 2
+    s.c = 3
+    s.d = 4
+    s.base = 1000
+    s %= s.c # rebase at symbol 'c'
+    print(s.a, s.b, s.c, s.d) # "998 999 1000 1001"
+"""
+
+def Symtbl(*, base=0, **symbols):
+    """
+    Create a new Symtbl object
+
+    Return an empty Symtbl or, optionally, one initialized with the given
+    symbol values. Arguments _must_ be keyword arguments.
+
+    Users should call this function instead of attempting to construct the
+    Symtbl class. Construction is implemented via a normal function to prevent
+    any argument name from conflicting with __init__'s bound instance parameter.
+    """
+    self = SymtblImpl({}, 0, base)
+    for k, v in symbols.items():
+        self[k] = v
+    return self
+
+class SymtblImpl:
+    """Symtbl implementation class"""
+
+    def __init__(self, entries, adjust, base):
+        """Construct Symtbl from instance data"""
+        object.__setattr__(self, "__entries__", entries)
+        object.__setattr__(self, "__adjust__", adjust)
+        object.__setattr__(self, "base", base)
+
+    def __index__(self):
+        """Convert object to integer using base value"""
+        return self.base
+
+    def __matmul__(self, base):
+        """Create remapped version of object at absolute base"""
+        return SymtblImpl(self.__entries__, self.__adjust__, int(base))
+
+    def __add__(self, offset):
+        """Create remapped version of object at relative base"""
+        return self @ (self.base + offset)
+
+    def __sub__(self, offset):
+        """Create remapped version of object at relative base"""
+        return self @ (self.base - offset)
+
+    def __rshift__(self, offset):
+        """Create symbol adjusted version of object"""
+        return SymtblImpl(self.__entries__, self.__adjust__ + int(offset), self.base)
+
+    def __lshift__(self, offset):
+        """Create symbol adjusted version of object"""
+        return self >> (-offset)
+
+    def __mod__(self, offset):
+        """Create symbol rebased version of object"""
+        return self >> (self.base - offset)
+
+    def __getattr__(self, symbol):
+        """Return symbol offset or subtable via pseudo-attribute"""
+        return self[symbol]
+
+    def __setattr__(self, symbol, value):
+        """Set symbol offset or subtable via pseudo-attribute"""
+        self[symbol] = value
+
+    def __delattr__(self, symbol):
+        """Unset symbol via pseudo-attribute"""
+        del self[symbol]
+
+    def __len__(self):
+        """Return number of defined symbols"""
+        return len(self.__entries__)
+
+    def __getitem__(self, symbol):
+        """Return symbol offset or subtable via subscript"""
+        if symbol == "base":
+            return self.base
+        return self.__entries__[symbol] + (self.base + self.__adjust__)
+
+    def __setitem__(self, symbol, value):
+        """Set symbol offset or subtable via subscript"""
+        if symbol == "base":
+            object.__setattr__(self, "base", int(value))
+        elif symbol in dir(self):
+            raise KeyError(f"Symtbl: name '{symbol}' is reserved")
         else:
-            if type(value) is tuple: self.sub[ident], off = value
-            else: off = value
-            self.sym[ident] = off - self.base
-
-    def map(self, addr, off=0):
-        self = self._namesp
-        mm = Symtbl()
-        mm._namesp.sym, mm._namesp.sub = self.sym, self.sub
-        mm._namesp.base = addr - off
-        return mm
-
-    def adjust(self, off):
-        self = self._namesp
-        for k, v in self.sym.items():
-            self.sym[k] = v + off
-
-    def rebase(self, off):
-        self.adjust(self.base - off)
-
-    def __str__(_self):
-        FMT = '\n{:<20} {:<20}'
-        self = _self._namesp
-
-        s = f'{len(self.sym)} symbols @ {hex(_self.base)}'
-        s += FMT.format('ADDRESS', 'SYMBOL')
-        for sym, _ in sorted(self.sym.items(), key=lambda x:x[1]):
-            addr = getattr(_self, sym)
-            if type(addr) is Symtbl:
-                s += FMT.format(hex(addr.base), f'[{sym}]')
-            else:
-                s += FMT.format(hex(addr), sym)
+            self.__entries__[symbol] = value - (self.base + self.__adjust__)
+
+    def __delitem__(self, symbol):
+        """Unset symbol via subscript"""
+        del self.__entries__[symbol]
+
+    def __iter__(self):
+        """Iterate over table entries as key:value tuples, like dict.items()"""
+        return iter({ k: self[k] for k in self.__entries__ }.items())
+
+    def __contains__(self, symbol):
+        """Test symbol name membership in table"""
+        return symbol in self.__entries__
+
+    def __repr__(self):
+        """Return string representation of Symtbl"""
+        return str(self)
+
+    def __str__(self):
+        """Return string representation of Symtbl"""
+        FMT = "\n{:<20} {:<20}"
+        s = f"{len(self)} symbols @ {hex(self)}"
+        s += FMT.format("ADDRESS", "SYMBOL")
+        for symbol, offset in sorted(self, key=lambda v: int(v[1])):
+            disp = f"[{symbol}]" if type(offset) is SymtblImpl else symbol
+            s += FMT.format(hex(offset), disp)
         return s
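
The subtable behavior described in the commit message (direct assignment of a Symtbl, and replacement of a mounted subtable by a plain int) follows from `__add__` and `__sub__` being defined on the class. The sketch below is a trimmed copy of the relevant methods from this patch, exercised with made-up symbol names and addresses. It is meant as a reading aid, not as a replacement for the module's own tests.

```python
class SymtblImpl:
    """Trimmed copy of the class from this patch (nesting subset only)."""

    def __init__(self, entries, adjust, base):
        object.__setattr__(self, "__entries__", entries)
        object.__setattr__(self, "__adjust__", adjust)
        object.__setattr__(self, "base", base)

    def __matmul__(self, base):
        return SymtblImpl(self.__entries__, self.__adjust__, int(base))

    def __add__(self, offset):
        return self @ (self.base + offset)

    def __sub__(self, offset):
        return self @ (self.base - offset)

    def __getattr__(self, symbol):
        # For a stored subtable, '+' dispatches to __add__ and yields a
        # remapped subtable; for a stored int it is plain addition.
        return self.__entries__[symbol] + (self.base + self.__adjust__)

    def __setattr__(self, symbol, value):
        # For a Symtbl value, '-' dispatches to __sub__, so the subtable's
        # base is stored as its offset within this table.
        self.__entries__[symbol] = value - (self.base + self.__adjust__)


child = SymtblImpl({}, 0, 0)
child.a = 1
child.b = 2

parent = SymtblImpl({}, 0, 0)
parent.nested = child @ 70          # mount the child at offset 70
mapped = parent @ 1000              # view of the parent at address 1000

print(parent.nested.a, parent.nested.b)   # 71 72
print(mapped.nested.a, mapped.nested.b)   # 1071 1072

parent.nested = 5                   # plain int replaces the mounted subtable
print(mapped.nested)                # 1005
```

After the last assignment, `nested` is an ordinary symbol in both `parent` and `mapped`, since the two objects share the same entries.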