compressor package


Module contents

compressor entry point

lib module


High-level functions exposed as an importable library.

compressor.lib.compress_file(filename: str, dest_file: str = '') → None

Open <filename> and write its compressed contents to a new file.

Parameters:
  • filename (str) – The path to the source file to compress.
  • dest_file (str) – The name of the target file. If not provided, the default <filename>.comp is used.
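The fallback to <filename>.comp could be sketched as below. The helper name resolve_dest_file is hypothetical and not part of compressor.lib; it only illustrates the naming rule described above.

```python
def resolve_dest_file(filename: str, dest_file: str = "") -> str:
    """Return dest_file, or <filename>.comp when none is given (hypothetical helper)."""
    # An empty string (the default in the signature) or None both trigger the fallback.
    return dest_file or f"{filename}.comp"
```

For example, resolve_dest_file("notes.txt") yields "notes.txt.comp", while an explicit dest_file is returned unchanged.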

cli module

Compressor CLI (command-line interface) module. Exposes the program's entry point for execution from the command line.

compressor.cli.argument_parser() → argparse.ArgumentParser

Create the argument parser object used to parse the arguments from sys.argv.

compressor.cli.main() → int

Entry point for the program's CLI.

Returns: Status code of the program.
Return type: int

compressor.cli.main_engine(filename: str, extract: bool = False, compress: bool = True, dest_file=None) → int

Main functionality of the program, whether run from the CLI or called as a library. extract and compress must have opposite values.

Parameters:
  • filename (str) – Path to the source file to process.
  • extract (bool) – If True, the program extracts a file.
  • compress (bool) – If True, the program compresses a file.
  • dest_file – Optional name of the target file.

Returns: 0 if executed without problems.
Return type: int
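The requirement that extract and compress hold opposite values could be enforced with a check like the one below. This is only a sketch: select_mode is a hypothetical name, and how main_engine actually reacts to inconsistent flags is not documented here.

```python
def select_mode(extract: bool = False, compress: bool = True) -> str:
    """Return the operation to run, rejecting inconsistent flags (hypothetical helper)."""
    # Both flags equal means either "do both" or "do nothing": neither is valid.
    if extract == compress:
        raise ValueError("extract and compress must have opposite values")
    return "extract" if extract else "compress"
```

With the documented defaults (extract=False, compress=True) the sketch selects compression.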

compressor.cli.parse_arguments(args=None) → dict

Parse the arguments provided on the command line and return a mapping of the options selected by the user to their values.

Returns: dict with the kwargs provided on the command line.
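A minimal sketch of how a parser and a dict-returning wrapper might fit together, using only the standard argparse module. The flag names (-c, -x, -d) and the function names are assumptions made for illustration; the actual options of the compressor CLI are not listed in this reference.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical options mirroring main_engine's parameters."""
    parser = argparse.ArgumentParser(prog="compressor")
    parser.add_argument("filename", help="path to the source file to process")
    # Exactly one mode is required, so compress and extract always differ.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-c", "--compress", action="store_true")
    mode.add_argument("-x", "--extract", action="store_true")
    parser.add_argument("-d", "--dest-file", dest="dest_file", default=None)
    return parser


def parse_to_dict(args=None) -> dict:
    """Parse args (sys.argv[1:] when None) and return the options as a dict."""
    return vars(build_parser().parse_args(args))
```

Because the keys match the parameter names of main_engine, a dict of this shape could be passed on as keyword arguments.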

compressor.core module


Low-level functionality at the core of the process, used by the main program.

It contains auxiliary functions.

class compressor.core.CharNode(value, freq, left=None, right=None)

Bases: object

Object that wraps the definition of a character in the text being processed. Used for comparison, and as a helper through its properties and methods.


Checks if the current node is a leaf in the tree. It is a leaf when it does not have any children (neither left nor right).

Returns:True if this node has no children, False otherwise.

Expose the value being held as read-only.
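A node with these traits might look like the sketch below. The class name Node, the property name leaf, and the choice to compare nodes by frequency are assumptions; only the constructor arguments come from the signature above.

```python
from functools import total_ordering


@total_ordering
class Node:
    """Illustrative tree node; not the actual CharNode implementation."""

    def __init__(self, value, freq, left=None, right=None):
        self._value = value
        self.freq = freq
        self.left = left
        self.right = right

    @property
    def value(self):
        # No setter is defined, so the value is read-only.
        return self._value

    @property
    def leaf(self) -> bool:
        # A leaf has neither a left nor a right child.
        return self.left is None and self.right is None

    def __eq__(self, other):
        return self.freq == other.freq

    def __lt__(self, other):
        # Assumed ordering: nodes compare by frequency.
        return self.freq < other.freq
```

Assigning to value on such a node raises AttributeError, which is what makes the attribute read-only.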

compressor.core.compress_and_save_content(input_filename: str, output_file: io, table: dict) → None

Open and process <input_filename>. Iterate over the file and write the compressed contents to output_file.

Parameters:
  • input_filename (str) – The source file to be compressed.
  • output_file (io) – Opened file where the outcome is written.
  • table (dict) – Mapping table for the char encoding.

compressor.core.create_tree_code(charset: List[compressor.core.CharNode]) → compressor.core.CharNode

Receives charset, a list of CharNode objects (characters) that are the leaves of the tree, and returns a tree with the corresponding prefix-free code.

Parameters: charset – iterable with all the characters to process.
Returns: tree with the prefix-free code for the charset.
Return type: CharNode

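One common way to build a prefix-free code tree from weighted leaves is the Huffman construction: repeatedly merge the two least frequent nodes. The sketch below assumes that approach, which this reference does not name, and uses a hypothetical Node tuple in place of CharNode. It expects a non-empty list of leaves.

```python
import heapq
from itertools import count
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Hypothetical stand-in for CharNode."""
    value: Optional[str]
    freq: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(leaves):
    """Merge the two least frequent nodes until a single root remains."""
    tiebreak = count()  # unique counter so nodes themselves are never compared
    heap = [(leaf.freq, next(tiebreak), leaf) for leaf in leaves]
    heapq.heapify(heap)
    while len(heap) > 1:
        freq1, _, first = heapq.heappop(heap)
        freq2, _, second = heapq.heappop(heap)
        parent = Node(None, freq1 + freq2, first, second)
        heapq.heappush(heap, (parent.freq, next(tiebreak), parent))
    return heap[0][2]
```

Frequent characters end up close to the root and therefore receive shorter codes.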
compressor.core.decode_file_content(compfile: io, table: dict, checksum: int) → str

Reconstruct the remaining part of <compfile>, starting right after the metadata, decoding each bit according to <table>.
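Bit-by-bit decoding against a reversed table (code to character) can be sketched as follows. The role of checksum is not described in this reference, so the sketch instead takes a hypothetical n_chars argument to know where the padding bits begin; the function name and the use of bytes keys are assumptions as well.

```python
def decode_bits(data: bytes, reversed_table: dict, n_chars: int) -> str:
    """Decode data with a code -> char table, stopping after n_chars characters."""
    if n_chars <= 0:
        return ""
    decoded = []
    current = b""
    for byte in data:
        # Walk the bits of each byte from most to least significant.
        for shift in range(7, -1, -1):
            current += b"1" if (byte >> shift) & 1 else b"0"
            # Prefix-free codes: the first match is always the right one.
            if current in reversed_table:
                decoded.append(reversed_table[current])
                current = b""
                if len(decoded) == n_chars:
                    return "".join(decoded)
    return "".join(decoded)
```

Since no code is a prefix of another, the decoder never needs to look ahead or backtrack.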

compressor.core.parse_tree_code(tree: compressor.core.CharNode, table: dict = None, code: bytes = b'') → dict

Given the tree built from the character frequencies, return a table that maps each character to its binary representation in the new code:

left –> 0

right –> 1

Parameters:
  • tree (CharNode) – the tree as returned by create_tree_code
  • table (dict) – Map with the translation of each character to its code in the new (prefix-free) system.
  • code (bytes) – The code prefix so far.

Returns: Mapping of each original char to its new code.
Return type: dict
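The left/right rule above amounts to a recursive walk that appends 0 when descending left and 1 when descending right. In this sketch the tree is a plain (value, left, right) tuple rather than a CharNode, and the name tree_to_table plus the single-leaf fallback to b"0" are choices made here, not taken from the library.

```python
def tree_to_table(tree, table=None, code=b""):
    """Map each leaf value to the path (as bytes of 0s and 1s) leading to it."""
    table = {} if table is None else table
    value, left, right = tree
    if left is None and right is None:
        # A tree made of a single leaf would get an empty code; use b"0" instead.
        table[value] = code or b"0"
        return table
    if left is not None:
        tree_to_table(left, table, code + b"0")
    if right is not None:
        tree_to_table(right, table, code + b"1")
    return table
```

Passing the same table through every recursive call is what lets the walk accumulate all codes in one dict.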

compressor.core.process_frequencies(stream: Sequence[str]) → List[compressor.core.CharNode]

Given a stream of text, return a list of CharNode with the frequencies for each character.

Parameters: stream – sequence with all the characters.

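Counting characters is the first step of the process and can be sketched with collections.Counter. The sketch returns plain (char, frequency) pairs instead of CharNode objects, and the sort order (by frequency, then by character) is an assumption added to make the result deterministic.

```python
from collections import Counter


def char_frequencies(stream):
    """Return (char, frequency) pairs, least frequent first."""
    counts = Counter(stream)
    return sorted(counts.items(), key=lambda pair: (pair[1], pair[0]))
```

Each pair carries the two pieces of information a CharNode is constructed from: the value and its freq.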
compressor.core.process_line_compression(buffer_line: str, output_file: io, table: dict) → None

Transform buffer_line into the new code, byte by byte, based on table, and save the new byte stream into output_file.

Parameters:
  • buffer_line (str) – A chunk of the text to process.
  • output_file (io) – The opened file where to write the result.
  • table (dict) – Translation table for the characters in buffer_line.

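Turning a chunk of text into packed bytes can be sketched as: concatenate the codes, pad with zeros to a whole number of bytes, and write the result. The function name encode_chunk, the zero padding, and returning the padding length are assumptions; how the real function handles a partial final byte is not documented here.

```python
import io


def encode_chunk(chunk: str, output_file, table: dict) -> int:
    """Write chunk encoded with table to output_file; return the padding bit count."""
    bits = b"".join(table[ch] for ch in chunk)
    if not bits:
        return 0
    # Pad with zero bits so the total length is a multiple of 8.
    padding = -len(bits) % 8
    bits += b"0" * padding
    # A fixed byte length preserves any leading zero bits.
    output_file.write(int(bits, 2).to_bytes(len(bits) // 8, "big"))
    return padding
```

A decoder needs some way to ignore the padding, for instance the number of encoded characters or the padding length itself.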
compressor.core.retrieve_compressed_file(filename: str, dest_file: str = '') → None

EXTRACT - Reconstruct the original file from the compressed copy. Write the output to the indicated dest_file.

compressor.core.retrieve_table(dest_file: io) → dict

Read the binary file and return the translation table as a reversed dictionary.

compressor.core.save_compressed_file(filename: str, table: dict, checksum: int, dest_file: str = '') → None

Given the original file by its filename, save a new, compressed one. table contains the new codes for each character in filename.

compressor.core.save_table(dest_file: io, table: dict) → None

Store the table in the destination file.

  • c: char
  • L: code of c (unsigned long)

Parameters:
  • dest_file (io) – Opened file where to write the table.
  • table (dict) – Mapping table with the chars and their codes.
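The c and L entries read like format characters of the standard struct module (a 1-byte char and an unsigned long). Assuming that, a round trip of the table could be sketched as below. Everything beyond the c/L pair is an assumption of this sketch: the function names, the little-endian standard layout ("<"), the entry-count header, the leading sentinel bit that preserves leading zeros, and the limit to single-byte characters and codes of at most 31 bits.

```python
import io
import struct

RECORD = struct.Struct("<cL")  # c: char (1 byte), L: code (unsigned long, 4 bytes)


def write_table(dest_file, table: dict) -> None:
    """Write an entry count followed by one (char, code) record per entry."""
    dest_file.write(struct.pack("<I", len(table)))
    for char, code in table.items():
        # Prefix a 1 bit so codes such as b"01" keep their leading zeros.
        packed = int(b"1" + code, 2)
        dest_file.write(RECORD.pack(char.encode("latin-1"), packed))


def read_reversed_table(src_file) -> dict:
    """Read the records back as a reversed mapping: code -> char."""
    (entries,) = struct.unpack("<I", src_file.read(4))
    table = {}
    for _ in range(entries):
        char, packed = RECORD.unpack(src_file.read(RECORD.size))
        code = bin(packed)[3:].encode()  # drop the "0b" prefix and the sentinel bit
        table[code] = char.decode("latin-1")
    return table
```

Reading the table back in reversed form (code to character) is what a decoder needs, which matches the description of retrieve_table.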

functions module