author Paul Sorensen <aedrax@gmail.com> 2018-02-14 23:33:35 -0500
committer GitHub <noreply@github.com> 2018-02-14 23:33:35 -0500
commit 2209d8c538e51daba96e25a287245aa12023b65d (patch)
tree b3f7fe0d8a6b727d3157452338cd5e60936abb59 /README.md
parent de95acfc7e40f0080ad16cc64dfe9470779bb0b9 (diff)
Update README.md
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 24
1 files changed, 23 insertions, 1 deletions
diff --git a/README.md b/README.md
index b6e3148..106cef4 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,23 @@
-# SorensenCompression
\ No newline at end of file
+# Sorensen Compression
+
+The basic idea is that you can generate your own data without really inflating anything, just by knowing the following information:
+
+1. Total size of the data
+2. Hash of the data
+3. Number of occurrences of each unique data group in your data
+
+For example, say you have the data bytes "abc".
+You know the length of the file is 3 bytes.
+You know the hash (in this example I'll use sha256) is ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad.
+
+Finally, for the unique data groups, I'll use the group {1} as an example and count the total number of binary 1's in the data. I believe this would be the most difficult to regenerate.
+
+To do this we take our data "abc",
+which in binary is 01100001 01100010 01100011.
+So there are 10 occurrences of the group {1}.
+
+So the total number of possible combinations is 24 choose 10 == 1961256. Out of all 2^24 == 16777216 possible 24-bit values, that brings the number of hashes to try down to only 11.69% of the total.
+
+![LaTex Image](http://mathurl.com/ycgnob6r.png "quick mafs")
+
+With more bits in a group, the data needed would increase but the computations required would decrease.
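
The scheme in this README can be sketched in a few lines of Python. This is only an illustration under assumptions, not code from the repository: the function names `count_ones`, `describe`, and `regenerate` are hypothetical, the only group used is {1}, and the search simply tries every placement of the 1 bits until the sha256 hash matches.

```python
import hashlib
from itertools import combinations
from math import comb
from typing import Optional, Tuple


def count_ones(data: bytes) -> int:
    """Count the 1 bits in data (occurrences of the group {1})."""
    return sum(bin(byte).count("1") for byte in data)


def describe(data: bytes) -> Tuple[int, str, int]:
    """Return the three values the README lists: size, hash, group count."""
    return len(data), hashlib.sha256(data).hexdigest(), count_ones(data)


def regenerate(size: int, digest: str, ones: int) -> Optional[bytes]:
    """Brute-force search: try every way to place `ones` 1 bits in size*8 bits.

    Hypothetical helper; returns the first candidate whose sha256 matches,
    or None if no candidate matches.
    """
    nbits = size * 8
    for positions in combinations(range(nbits), ones):
        value = 0
        for p in positions:
            value |= 1 << p
        candidate = value.to_bytes(size, "big")
        if hashlib.sha256(candidate).hexdigest() == digest:
            return candidate
    return None


if __name__ == "__main__":
    size, digest, ones = describe(b"abc")
    # Search space for the README example: 24 choose 10 versus 2^24.
    print(size, ones, comb(size * 8, ones), 2 ** (size * 8))
    print(regenerate(size, digest, ones))
```

For "abc" the search visits at most 1961256 candidates, so it can take a few seconds in plain Python. The cost grows with the binomial coefficient, which is why this is only practical for very small inputs.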