Age | Commit message | Author | Files | Lines
2023-03-22 | symtbl: order symtbl iteration by offset | dusoleil | 1 | -2/+2
When iterating over a symtbl, the returned tuples should be sorted by offset. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
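The ordering described above can be sketched as follows. This is only an illustration: a plain dict of symbol name to offset stands in for the table's storage, and `iter_by_offset` is a hypothetical helper, not a function from the project.

```python
def iter_by_offset(symbols):
    """Yield (name, offset) tuples ordered by offset, lowest first."""
    # sorted() is stable, so symbols sharing an offset keep insertion order.
    yield from sorted(symbols.items(), key=lambda pair: pair[1])


# Made-up symbols, inserted out of order on purpose.
table = {"main": 0x1139, "_start": 0x1040, "win": 0x1126}
ordered = list(iter_by_offset(table))
```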
2023-03-19 | Add CONTRIBUTING doc (tag: v0.3) | dusoleil | 1 | -0/+36
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-19 | r2: limit gadget search to exec privilege sections | dusoleil | 1 | -1/+1
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-19 | builder: Add initial version of ROP chain tools | Malfurious | 3 | -4/+404
Adds a ROP-enabled payload builder under the builder namespace. Much of the behavior is parameterized by the active arch, so several new columns are added to the Arch class. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-19 | builder: Add rop gadget annotation class | Malfurious | 3 | -1/+111
This dataclass is intended to be used directly with the new ROP builder class. GadHints allow users to teach the library about gadgets it can not find on its own and how to use them correctly. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-19 | payload: Add method end() | Malfurious | 1 | -0/+3
To determine the address of the end of a payload, based on its Symtbl data. I believe it makes the most sense to make this a part of the Payload API, since Symtbl lacks a concept of element size. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-19 | Create new subpackage 'builder' | Malfurious | 3 | -2/+5
This is a package to contain the related Payload and ROP modules, as well as utility classes. Payload is moved into the new package. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-19 | rev: Normalize the reported offset of found gadgets | Malfurious | 2 | -3/+4
ROP gadgets returned through search from the r2 API will now always contain a file-relative offset, even if they come from a non-pic binary using a fixed baddr. However, gadgets returned through the ELF API will be mapped according to the ELF's Symtbl. This ensures the correct offset is returned following a library leak, and allows the user to always safely insert an ELF-returned gadget into that ELF's Symtbl without issue. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-19 | symtbl: Support offset translation for int-like objects | Malfurious | 1 | -1/+1
This fixes a bug with Symtbl's __getitem__. An object that is convertible to int should also cause __getitem__ to behave as though an int was given, and translate the object as a foreign offset. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-16 | elf: Add docstrings | dusoleil | 1 | -0/+107
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-16 | elf: Automatically lookup Arch on ELF construction | dusoleil | 1 | -0/+2
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-16 | elf: Add bininfo to ELF under .info and .security | dusoleil | 1 | -9/+54
On ELF construction, call r2.get_bin_info() and keep the results under the pseudo-namespaces .info and .security. Also add a pretty-print to these in a tabulated form. Also rewrite the ELF pretty-print to just summarize and not print out the entirety of .sym. Lastly, fixed a small bug where ELF could crash on construction if ldd fails (loading a non-native ELF, for instance). Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-16 | r2: Use get_bin_info in get_elf_symbols | dusoleil | 1 | -5/+5
Code reuse since we were using r2 iI in get_elf_symbols to get the baddr. This can cause get_bin_info to be called (and log that it's being called) multiple times, so I'm also adding the @cache annotation. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-16 | r2: Add ability to lookup info about a binary. | dusoleil | 1 | -0/+12
Call r2's iI command and return a subset of the fields that we care about. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-16 | arch: Add Arch lookup | dusoleil | 1 | -4/+18
You can now look up a predefined Arch based on a tuple of arch_string (returned by r2 iI), wordsize, and endianness. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
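A lookup keyed on that tuple might look like the sketch below. The field names, the predefined entries, and the `lookup_arch` helper are assumptions made for illustration; the log does not show the real Arch columns or the values r2 reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arch:
    # Illustrative fields only; the real class carries more columns.
    arch_string: str
    wordsize: int      # in bytes
    endianness: str    # "little" or "big"


# Hypothetical predefined entries.
x86 = Arch("x86", 4, "little")
x86_64 = Arch("x86", 8, "little")
ARM = Arch("arm", 4, "little")

_KNOWN = {(a.arch_string, a.wordsize, a.endianness): a for a in (x86, x86_64, ARM)}


def lookup_arch(arch_string, wordsize, endianness):
    """Return the predefined Arch matching the tuple, or None if unknown."""
    return _KNOWN.get((arch_string, wordsize, endianness))
```

Returning None for an unknown tuple is a design choice of this sketch; raising an error or falling back to a default arch would work equally well.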
2023-03-16 | arch: Move predefined Arch's to top of file | dusoleil | 1 | -10/+17
Also added a DEFAULT_ARCH constant. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-16 | arch: Move private methods to bottom of file | dusoleil | 2 | -14/+17
Also check type when setting arch. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-15 | r2: Increase maximum rop gadget length | Malfurious | 1 | -1/+1
Sets the value of rop.len = 10 in r2, to give the search function more data to sift through. This is a doubling from the default value (5). Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-15 | rev: Update rop gadget search functionality | Malfurious | 2 | -32/+61
Development on the rop chain builder has produced this upgrade to our gadget search facility. The primary advantages in this version are increased flexibility and runtime performance.
It is now easier to find specific 'stray' instructions (not immediately followed by a ret) since we search from every position in the data returned by r2. If you _do_ want a ret, just specify it in your input regexes. For this reason, a dedicated function for locating a simple 'ret' gadget is no longer present - elf.gadget("ret") is the equivalent.
A major change in this version is that we now obtain and operate on r2's JSON representation of the gadget data. We now only reach out to r2 once to get all information for a binary (which is cached) and the actual 'search' is implemented in Python. This provides a significant performance speedup in cases where we need many gadgets from one binary, as r2 doesn't need to inspect the entire file each time. Additional caching is done on specific search results, so that 100% redundant searches are returned immediately.
Access to the raw JSON data is made available through a new function rop_json(), but is not exposed in the ELF interface, since it seems like a niche need.
Search results are returned via Gadget objects (or a list thereof), which contain regular expression Match objects for each assembly instruction found in the gadget. This allows the caller to retrieve the values contained in regular expression capture groups if present.
Also, anecdotally, the search functionality in r2 has seemed to return false negatives for some queries in the past, whereas I haven't noticed similar cases with this implementation yet.
Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
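The search strategy described above (fetch the gadget data once, then match regexes from every instruction position in Python, caching repeated queries) can be sketched roughly as below. Everything here is an assumption: `ROP_DATA` is hand-written stand-in data rather than r2's actual JSON schema, and `search` returns plain tuples where the real code returns Gadget objects.

```python
import re
from functools import cache

# Hypothetical stand-in for gadget data fetched once from r2.
# Each gadget is a sequence of (offset, instruction text) pairs.
ROP_DATA = (
    ((0x1000, "pop rdi"), (0x1001, "ret")),
    ((0x1010, "mov rax, rdi"), (0x1013, "pop rbp"), (0x1014, "ret")),
    ((0x1020, "pop rsi"), (0x1021, "pop r15"), (0x1023, "ret")),
)


@cache  # identical queries are answered from the cache
def search(*regexes):
    """Return (offset, matches) for each place the regexes match in sequence."""
    if not regexes:
        return ()
    patterns = [re.compile(r) for r in regexes]
    results = []
    for gadget in ROP_DATA:
        # Try every starting instruction, not only the head of each gadget.
        for start in range(len(gadget) - len(patterns) + 1):
            window = gadget[start:start + len(patterns)]
            matches = [p.fullmatch(text) for p, (_, text) in zip(patterns, window)]
            if all(matches):
                results.append((window[0][0], tuple(matches)))
    return tuple(results)


hits = search(r"pop (\w+)", "ret")
registers = [matches[0].group(1) for _, matches in hits]
```

Because each result keeps its Match objects, the caller can read capture groups, as `registers` does here.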
2023-03-15 | rev: Add rop gadget description class | Malfurious | 2 | -2/+38
This new class is intended to be used to return data from gadget searches, and is able to be nested within object Symtbls. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-14 | symtbl: Overload __getitem__ for translating raw offsets | Malfurious | 1 | -3/+6
Can now use Symtbl subscript syntax to obtain the mapped address of a foreign offset (not a defined symbol) without having to modify the object or add a new symbol entry. Assuming a base value of 10, tbl[15] will return 25, for example. We now assert that the defined table keys are strings, to prevent the creation of entries that are now un-readable by this patch. However, this always should have been the case. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
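The subscript behavior can be illustrated with a toy class. `MiniTbl` is a hypothetical stand-in written for this note, not the real Symtbl, and it models only the two cases the message describes.

```python
class MiniTbl:
    """Toy table: string keys name symbols, int keys are foreign offsets."""

    def __init__(self, base=0, **symbols):
        # Mirror the assertion from the commit: keys must be strings.
        assert all(isinstance(name, str) for name in symbols)
        self.base = base
        self.symbols = dict(symbols)

    def __getitem__(self, key):
        if isinstance(key, int):
            # A raw offset is mapped onto the base without being stored.
            return self.base + key
        return self.base + self.symbols[key]


tbl = MiniTbl(base=10, start=0, helper=4)
```

With a base of 10, `tbl[15]` gives 25 as in the commit's example, and the table still holds only its two named symbols.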
2023-03-13 | arch: Explicitly convert to int before type conversions | dusoleil | 1 | -1/+1
Sometimes we might be working on an object that can be treated as an int, but python won't automatically type coerce. For example, grabbing a nested symtbl and passing it in here expecting it to resolve to a type conversion of its base offset. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-13 | elf: Fix visual bug printing libraries list | Malfurious | 1 | -2/+2
Previously, due to precedence rules, the text produced for any library whose corresponding ELF object has already been initialized would simply be `str(lib.path)`, instead of the intended formatted string. Also fixes a typo. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-13 | symtbl: Only print column headings if table is populated | Malfurious | 1 | -1/+2
QoL change - Don't print the headings if the table is empty. Just report "0 symbols" and the base address. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-13 | symtbl: Display all nested objects in brackets | Malfurious | 1 | -1/+1
When printing a human readable Symtbl, show all nested objects within [brackets], not just Symtbl itself. Primarily useful since more types are being developed with the intent of being stored in a Symtbl. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-13 | Prefer __repr__ for pretty-printing objects | Malfurious | 2 | -7/+4
Define human-readable string formatting for objects in repr, rather than str, as this will enable an interactive interpreter to more conveniently show this data to the user. I believe this especially makes sense in cases where __str__ doesn't perform a semantic type conversion for its class (currently, all affected cases). Scripts can still easily yield this information by using `print(object)`, as print will fall back to repr(object) when there is not an explicitly defined __str__.
Furthermore, this patch still maintains backwards compatibility (for the time being) of using str(object) to retrieve the information. This is because the default __str__ implementation will defer to __repr__ if provided. This made the Symtbl case of providing both of them especially redundant.
Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
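The fallback this message relies on is standard Python behavior and can be checked with a small class. `Table` below is a made-up example, not a class from the project.

```python
class Table:
    def __init__(self, base):
        self.base = base

    def __repr__(self):
        # Human-readable summary; __str__ is deliberately left undefined.
        return f"Table(base={self.base:#x})"


t = Table(0x400000)
shown_by_repr = repr(t)
shown_by_str = str(t)        # object.__str__ defers to __repr__
shown_by_format = f"{t}"     # default formatting goes through str()
```

All three strings are identical, so both an interactive prompt and `print(t)` show the same text.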
2023-03-13 | payload: Add explicit width ints | dusoleil | 1 | -0/+16
Signed-off-by: dusoleil <howcansocksbereal@gmail.com> Reviewed-by: Malfurious <m@lfurio.us>
2023-03-13 | arch: refactor byte/int conversions | dusoleil | 2 | -6/+12
The built-in int's to_bytes and from_bytes functions have some weird behavior with the signed parameter. Rather than expecting the user to properly give btoi/itob the right signed value to pass through to to_bytes/from_bytes, it makes more sense to just always convert an unsigned number. Using the new int conversions, this can always be unambiguous with respect to the width of the int.
There may also be situations where a user would like to truncate/sign extend an int to a certain length other than the configured architecture wordsize or convert to a different endianness. These are now parameterized.
There is no need to parameterize the width for btoi because you will now always get an unsigned int back (and because of python, the width is ambiguous). The user can convert it to whatever width/sign they want after the fact with the new int conversion methods.
This also means that payload's int() does not need to take a signed argument either. Whatever sign of int you give it, when it calls itob, it will get the correct bytearray at the width of the configured architecture's wordsize.
Signed-off-by: dusoleil <howcansocksbereal@gmail.com> Reviewed-by: Malfurious <m@lfurio.us>
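One way to get the "always unsigned" behavior is to mask the value to the target width before encoding. The sketch below reuses the names itob and btoi from the message, but the signatures and defaults are assumptions, and `to_signed` is a hypothetical stand-in for the explicit int conversions.

```python
def itob(value, width=8, byteorder="little"):
    """Encode value as exactly width bytes; negatives wrap to two's complement."""
    mask = (1 << (8 * width)) - 1
    # Masking yields an unsigned number, so signed= never has to be passed.
    return (value & mask).to_bytes(width, byteorder)


def btoi(data, byteorder="little"):
    """Decode bytes as an unsigned integer."""
    return int.from_bytes(data, byteorder)


def to_signed(value, width=8):
    """Reinterpret the low width bytes of value as a signed integer."""
    bits = 8 * width
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


packed = itob(-1, width=4)
unpacked = btoi(packed)
```

Masking also handles truncation: a value wider than `width` keeps only its low bytes instead of raising OverflowError.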
2023-03-13 | arch: Add explicit int conversions | dusoleil | 1 | -6/+57
Signed-off-by: dusoleil <howcansocksbereal@gmail.com> Reviewed-by: Malfurious <m@lfurio.us>
2023-03-13 | arch: Add docstrings | dusoleil | 1 | -0/+33
Signed-off-by: dusoleil <howcansocksbereal@gmail.com> Reviewed-by: Malfurious <m@lfurio.us>
2023-03-13 | arch: Use dataclass instead of namedtuple | dusoleil | 1 | -13/+15
Python's dataclass annotation gives us a nice way to cleanly and concisely define our list of supported architectures similar to namedtuple. Unlike namedtuple, though, dataclass gives us an actual class that is significantly more feature rich and even allows us to add functionality. In general, these are meant to be like const records of info about an architecture, so we use frozen=True to enforce some const correctness.
There were some issues when involving other classes for the ActiveArch feature (subclassing and composition both had their respective issues), so I'm removing __ActiveArch__ and putting a set() method directly on Arch. This method will copy a given Arch into the self object. This technically breaks const correctness as this does modify the object, but it is intended to only be used on a single sentinel Arch that represents the active arch. This arch is initialized with x86_64 by default.
Signed-off-by: dusoleil <howcansocksbereal@gmail.com> Reviewed-by: Malfurious <m@lfurio.us>
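A frozen dataclass with a copy-in set() method could be written as in this sketch. The two fields and the way set() bypasses the frozen guard (via `object.__setattr__`) are assumptions; the log does not show how the project implements it.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Arch:
    # Illustrative fields only.
    wordsize: int
    endianness: str

    def set(self, other):
        """Copy every field of other into self, bypassing the frozen guard."""
        for f in fields(self):
            # frozen=True blocks normal assignment, so go through object.
            object.__setattr__(self, f.name, getattr(other, f.name))


x86_64 = Arch(8, "little")
ARM = Arch(4, "little")

# Single mutable sentinel representing the active arch, x86_64 by default.
arch = Arch(x86_64.wordsize, x86_64.endianness)
arch.set(ARM)
```

Ordinary assignment such as `arch.wordsize = 8` still raises, so the predefined records stay effectively constant and only set() can change the sentinel.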
2023-03-01 | Add special cases for read(size <= 0) | dusoleil | 1 | -4/+9
A read of 0 isn't particularly useful, but it is weird that it will cause a BrokenPipeError. Instead, it makes more sense to just return an empty string. A read of <0 would normally read until EOF, but we already have that feature in readall() and it wouldn't be particularly useful here. A similar functionality of reading the entire current contents of the buffer is useful, though. This is already implemented in readall_nonblock() and this would be a nice user-facing way of calling that. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-03-01 | Add io.last as the result of the last discrete read | dusoleil | 1 | -1/+9
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-24 | r2: Simplify Symtbl construction in get_locals() (tag: v0.2) | Malfurious | 1 | -3/+1
Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-24 | symtbl: Refactor module as an improved container type (and more) | Malfurious | 2 | -54/+185
This effort was triggered by three immediate wants of the module:
- An improved data container interface to support things like key iteration and better key management. This is primarily wanted by the ROP module (which is still in development).
- The introduction of package documentation across the project. This module is now fully documented.
- To fix a bug in the Symtbl constructor, which would not allow a caller to supply "self" as an initial symbol name, even though it is legal in every other context. This problem was caused by the constructor's bound instance parameter sharing this name.
This patch addresses all of these concerns, and also introduces some fringe / QoL improvements that were discovered during the API refactor.
Element access may now be done via subscripting, as well as the previous (and still generally preferred) .attribute notation.
The syntax for storing subtables within a parent Symtbl is now greatly streamlined due to some implementation-level changes to the class. You may now directly assign just a Symtbl object or a normal int, and you don't have to fuss with tuples anymore. The subtable's base is taken as its offset in the parent, and the new operator replacement for the .map() method may be used to define a desired value for the parent.
This detail is actually a breaking change compared to the previous version. While not technically a bug, it is unintuitive that the previous version would not remove subtables when their offset was changed by a simple assignment - the table would just move. This patch makes it such that any symbol assignment to a regular int will replace an old mounted subtable if one exists.
There are now no normal instance methods on the Symtbl type (only dunder method overrides). This is to free up the available symbol namespace as much as possible. The previous methods map(), adjust(), and rebase() are now implemented as operators which, in every case, yield a new derivative object, rather than mutating the original.
All operators are listed here:
    @   remap to absolute address
    +   remap to relative address
    -   remap to negated relative address
    >>  adjust all symbol offsets upward
    <<  adjust all symbol offsets downward
    %   rebase all symbol offsets around an absolute zero point
Additionally, Symtbl objects will convert to an integer via int(), hex(), oct(), or bin(), yielding the base value.
The addition of these operators presents another breaking change to the previous version. Previously, symbol adjustments or rebases affected the tracked offsets and caused symbols to shift around in linked tables as well. Since these operators now preserve the state of their source object, this is no longer the case. The amount of shift due to adjustment or rebasing is localized in a specific Symtbl instance (and is affected by the use of the related operators); however, this value is inherited by derivatives of that object.
There is a third breaking change caused by the use of operators as well. Previously, the map() function allowed the caller to specify that the given absolute address is not that of the table base, but of some offset in the table, from which the new base is calculated. However, the remapping operators take only a single numeric value as their right hand side operand, which is the absolute or relative address. The new intended way of accomplishing this (which is _nearly_ equivalent) is through the combined use of the rebase and remap operations:
    # The address of the puts() function in a libc tbl is leaked
    sym = sym % sym.puts @ leak
aka: adjust offsets such that the known point is at the base, then move that base to the known location.
The way in which this is different to what you would end up with before is that previously, following a map(abs, off) the base of the table would be accurately valued according to the known information. Now, the 'base' is considered to be the leaked value, but internal offsets are shifted such that they still resolve correctly.
Finally, a few new pieces of functionality are added to build out the container API:
- symbol key deletion
- iteration over symbol:offset pairs
- can now check for symbol existence with the "in" keyword
- len(symtbl) returns the number of symbols defined
Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
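The rebase-then-remap idiom from this message can be modeled with a toy class. `MiniTbl` is a hypothetical illustration that assumes a symbol resolves to base + offset + adjust and supports only `%`, `@`, `int()`, `in` and `len()`. It uses subscripts where the real Symtbl also offers attribute access, and the addresses are made up.

```python
class MiniTbl:
    """Toy model: a symbol resolves to base + offset + adjust."""

    def __init__(self, offsets, base=0, adjust=0):
        self._offsets = dict(offsets)
        self._base = base
        self._adjust = adjust

    def __getitem__(self, name):
        return self._base + self._offsets[name] + self._adjust

    def __int__(self):
        return self._base

    def __matmul__(self, address):
        # tbl @ addr: derivative table whose base is the absolute address.
        return MiniTbl(self._offsets, address, self._adjust)

    def __mod__(self, address):
        # tbl % addr: derivative table shifted so addr resolves to the base.
        shift = address - self._base
        return MiniTbl(self._offsets, self._base, self._adjust - shift)

    def __contains__(self, name):
        return name in self._offsets

    def __len__(self):
        return len(self._offsets)


libc = MiniTbl({"puts": 0x80E50, "system": 0x50D70})
leak = 0x7F1234A80E50   # pretend this puts() address was leaked at runtime

# % and @ share precedence and associate left: (libc % puts) @ leak.
mapped = libc % libc["puts"] @ leak
```

In this model `mapped["puts"]` equals the leak and `int(mapped)` reports the leak as the base, matching the "nearly equivalent" caveat above, while `libc` itself is left unchanged.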
2023-02-24 | symtbl: Rename file to match class name | Malfurious | 4 | -4/+4
I assume that the preferred style is to leave one major class each to a file. In this case, synchronize the names of the Symtbl class and its containing module. Per PEP8, the module is lowercase, and the class remains Pascal case. If other memory-oriented utilities are introduced in the future, we may wish to move them, as well as Symtbl, back into a subpackage named 'mem'. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-23 | Add the version to the splash screen | dusoleil | 1 | -1/+2
Print the current version (sourced from git describe) when sploit starts up. Signed-off-by: dusoleil <howcansocksbereal@gmail.com> Reviewed-by: Malfurious <m@lfurio.us>
2023-02-23 | Dynamically source version in toml from git | dusoleil | 4 | -5/+54
Instead of hard-coding the version into the pyproject.toml, we can dynamically source it at build time. Ideally, we want to use git describe as a single authority source on the version.
The version is stored in sploit.__version__ and can be consumed during sploit runtime or during a build/package to populate the project's core metadata version in the toml file. hatchling provides a tool.hatch.version plugin that can read out the variable during a build/package.
Because this variable is populated from a git command, if the source tree isn't in a git repo, it will fail. In this case, sploit will report a PEP 440 compliant fake version "0+unknown.version" to let the user know.
Because a packaged distribution doesn't exist in a git repo, we want to bake in the version at build time into the package. hatchling provides a plugin to help with this, but it had some technical limitations that didn't quite work for our use case. Instead, I added a custom build hook which will take the version sourced from the package (and by proxy the git command), and overwrite the __init__.py with a hard-coded version in the __version__ variable. This means that built/packaged distributions of this project will have a fixed version hard-coded in rather than dynamically sourcing from git.
The build hook operates just before the build executes. It seems that most build/packager front-ends (e.g. build, pip) will just run it in the current source tree rather than making a temp copy. This means that when we modify the __init__.py, it is modifying our git tree. Ideally, we want this to be restored at the end of the build. The build hook interface allows us to write a hook that happens after the build, but it won't run in the case of a crash or failed build. Instead, I added a custom solution to this using a member variable deconstructor. If the build ends in any way, the original contents of __init__.py are written back out.
Signed-off-by: dusoleil <howcansocksbereal@gmail.com> Reviewed-by: Malfurious <m@lfurio.us>
2023-02-23 | Update project's build and package to the newer standard | dusoleil | 3 | -7/+23
Currently, the standard way to build and package a Python project is through a pyproject.toml file rather than the old setup.py. This is also build back-end agnostic and we can choose to use something other than setuptools. After looking through a few options, I've decided to use hatchling. Signed-off-by: dusoleil <howcansocksbereal@gmail.com> Reviewed-by: Malfurious <m@lfurio.us>
2023-02-18 | comm: Localize stdin nonblock to interact's readall | dusoleil | 1 | -4/+6
In interact(), we set stdin to be nonblocking for the duration of the function. As an unexpected side-effect, this was setting stdout to be nonblocking as well. This has caused at least one crash in the past. Localizing the nonblock to just when we're reading from stdin should solve this. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
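Scoping the non-blocking mode to a single read can be done with a small context manager, sketched here with the POSIX-only `os.get_blocking` and `os.set_blocking`. The `nonblocking` helper is hypothetical. One plausible cause of the side effect (not stated in the log) is that O_NONBLOCK belongs to the open file description, which stdin and stdout can share when both are the same terminal.

```python
import os
from contextlib import contextmanager


@contextmanager
def nonblocking(fd):
    """Make fd non-blocking inside the with-block, then restore its old mode."""
    was_blocking = os.get_blocking(fd)
    os.set_blocking(fd, False)
    try:
        yield
    finally:
        # Restore even if the read raised, so other users of fd are unaffected.
        os.set_blocking(fd, was_blocking)


# Demonstrated on a pipe rather than the real stdin.
read_end, write_end = os.pipe()
with nonblocking(read_end):
    try:
        data = os.read(read_end, 64)
    except BlockingIOError:
        data = b""          # nothing buffered; we did not block waiting
restored = os.get_blocking(read_end)
os.close(read_end)
os.close(write_end)
```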
2023-02-18 | Use buffered read throughout Comm | dusoleil | 1 | -1/+1
We had originally decided to use the os.read() function instead of the actual buffered file object's read function. This was due to the blocking behavior of os.read() being closer to POSIX read than the other function. As it turns out, os.read() is an unbuffered read. Every other read call in this interface is buffered. This causes some undefined behavior in certain cases and leads to some really confusing bugs. After some discussion, we've decided that, in this application's domain, the blocking behavior of the buffered file object's read is actually often more useful anyways. Changing this call will deal with both issues. Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-18 | Read once at the beginning of interact mode. | dusoleil | 1 | -0/+1
This behavior was accidentally removed in dcba5f2.
Interact mode works by polling for IO events, but it will miss any unread data already in the buffer when it is first entered. We can ensure this gets caught by just doing a read once at the beginning.
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-18 | comm: Strip \n character from readline() | Malfurious | 1 | -1/+4
Line-oriented reads now strip the newline from the end of their returned string. Additionally, readall() strips the newline, but only from the string that gets logged to the user's terminal (goodbye to all the "\n" printed at the end of each line).
Of course, these functions are called by other parts of the read API and have downstream effects. Consideration was given to the entire API with these rules in mind:
- Raw reads (or non-line-oriented reads) will not filter ANY of their read content. They are logged to the screen as one "line" of log text with \n characters shown in-place (not actually resetting the terminal cursor). If reading binary, these bytes don't actually mean line termination anyway.
  functions: read, readall(_nonblock) *, readuntil
- Line-oriented reads will strip the terminating \n, log the single line to the screen, and return it.
  functions: readline, readlineuntil **
* readall(_nonblock) functions turn out to be a special case. They will operate as raw reads, returning a blob of content. However, we generally want to run them on line-oriented input, so they log according to the line-oriented rules.
** Although content returned from readlineuntil will have \n's stripped, the lines are returned in an array, so we can still distinguish them.
Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-18 | comm: Add default argument for writeline() | Malfurious | 1 | -1/+1
The writeline function will now default to send an empty line when called without an argument. I don't believe any such default makes sense for the plain write function, as writing nothing should have no effect. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-18 | comm: Enable logonread during interact() | Malfurious | 1 | -0/+3
This is normally not an issue, since logonread defaults to True. However, if the user disables this setting, interact() becomes a lot less useful. logonread is now forced on during io.interact(), but respected through the rest of the API. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-18 | comm: Squelch BrokenPipeError during shutdown() | Malfurious | 1 | -1/+4
Failure to close target stdout is not interesting. Furthermore, if sploit ever gets into this situation, the user script has likely already raised a more useful error/backtrace. Handling this exception typically results in a duplicate error. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-18 | Always shutdown comms after executing script | Malfurious | 1 | -2/+3
Moving this io cleanup code to the finally block allows it to also run when recovering from an exception. This prevents cases where the target may hang if the user sploit script crashes, and avoids requiring the user to press an additional CTRL-C to move on. Signed-off-by: Malfurious <m@lfurio.us> Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
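The cleanup pattern is the standard try/finally idiom. In the sketch below, `run_script`, `FakeIO`, and `crashing_script` are hypothetical names used only to show that shutdown runs even when the user script raises.

```python
def run_script(script, io):
    """Run script(io) and always shut the connection down afterwards."""
    try:
        script(io)
    finally:
        # Runs on success and on any exception, so the target never hangs.
        io.shutdown()


class FakeIO:
    """Stand-in for the comm object; records whether shutdown() was called."""

    def __init__(self):
        self.closed = False

    def shutdown(self):
        self.closed = True


def crashing_script(io):
    raise RuntimeError("user script crashed")


fake = FakeIO()
try:
    run_script(crashing_script, fake)
except RuntimeError:
    pass  # the original error still propagates to the caller
```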
2023-02-12 | Version 0.1 (tag: v0.1) | dusoleil | 1 | -1/+1
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2023-02-12 | Add .gitignore, README, and UNLICENSE | dusoleil | 3 | -0/+91
Signed-off-by: dusoleil <howcansocksbereal@gmail.com>
2022-09-12 | Merge branch 'sploit/symtbl-base' | Malfurious | 2 | -8/+15
This branch brings some conveniences to the semantics behind Symtbl base values.
* sploit/symtbl-base:
  sploit: rev: Properly base Symtbls for non-PIC binaries
  sploit: Fix bugs involving Symtbl base value
  sploit: mem: Allow Symtbl base to be modified