<small><i>This notebook was prepared by [Donne Martin](http://donnemartin.com). Source and license info is on [GitHub](https://bit.ly/code-notes).</i></small>

## Problem: Compress a string such that 'AAABCCDDDD' becomes 'A3B1C2D4'

* [Constraints](#Constraints)
* [Test Cases](#Test-Cases)
* [Algorithm: List](#Algorithm:-List)
* [Code: List](#Code:-List)
* [Algorithm: Byte Array](#Algorithm:-Byte-Array)
* [Code: Byte Array](#Code:-Byte-Array)
* [Unit Test](#Unit-Test)

## Constraints

*Problem statements are often intentionally ambiguous.  Identifying constraints and stating assumptions can help to ensure you code the intended solution.*

* Can I assume the string is ASCII?
    * Yes
    * Note: Unicode strings could require special handling depending on your language
* Can I use additional data structures?
    * Yes
* Is this case sensitive?
    * Yes
* Do you compress even if it doesn't save space?
    * No
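
The ASCII assumption can be checked up front rather than taken on faith. Below is a minimal sketch, assuming Python 3.7+ (where `str.isascii()` is available). The helper name `require_ascii` is hypothetical and is not part of the solutions in this notebook.

```python
def require_ascii(string):
    """Return string unchanged, or raise ValueError if it is not pure ASCII.

    Hypothetical helper for illustration; assumes Python 3.7+ (str.isascii).
    """
    if string is not None and not string.isascii():
        raise ValueError('expected an ASCII string')
    return string


print(require_ascii('AAABCCDDDD'))  # AAABCCDDDD
```

None and the empty string pass through untouched, which matches the edge cases handled by the solutions.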

## Test Cases

* None -> None
* '' -> ''
* 'ABC' -> 'ABC'
* 'AAABCCDDDD' -> 'A3B1C2D4'

## Algorithm: List

Since Python strings are immutable, we'll build the result in a list of characters, which mimics the in-place string manipulation you would do on a C string.  The diagram below shows a null-terminated C string; Python strings do not use a null terminator.

![alt text](https://raw.githubusercontent.com/donnemartin/algorithms-data-structures/master/images/compress_string.jpg)
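
To make the immutability point concrete, here is a small sketch (the variable names are illustrative only) showing that item assignment fails on a `str` but works on a list of characters, which is then joined back into a string once:

```python
s = 'AAAB'
try:
    s[0] = 'X'           # str does not support item assignment
except TypeError as exc:
    print('TypeError:', exc)

chars = list(s)          # mutable copy: ['A', 'A', 'A', 'B']
chars[0] = 'X'           # in-place update works on a list
result = ''.join(chars)  # build a new str once at the end
print(result)            # XAAB
```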

* Calculate the size of the compressed string
* If the compressed string size is >= string size, return string
* Create compressed_string
    * For each char in string
        * If char is the same as last_char, increment count
        * Else
            * Append last_char to compressed_string
            * Append count to compressed_string
            * count = 1
            * last_char = char
    * After the loop, append last_char to compressed_string
    * Append count to compressed_string
    * Return compressed_string

Complexity:
* Time: O(n)
* Space: O(m), where m is the size of the compressed result (the list plus the string joined from it, so roughly 2m in practice)
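
For comparison, the same run-length encoding can be sketched with `itertools.groupby` from the standard library. This is an alternative to the two-pass approach described above, not the method this notebook implements, and the name `compress_with_groupby` is hypothetical. It builds the candidate string first and compares lengths afterwards, so it always allocates the compressed form.

```python
from itertools import groupby


def compress_with_groupby(string):
    # None and '' are returned unchanged
    if not string:
        return string
    pieces = []
    # groupby yields one (char, run) pair per run of identical characters
    for char, run in groupby(string):
        pieces.append(char)
        pieces.append(str(sum(1 for _ in run)))
    compressed = ''.join(pieces)
    # Only keep the compressed form if it actually saves space
    return compressed if len(compressed) < len(string) else string


print(compress_with_groupby('AAABCCDDDD'))  # A3B1C2D4
```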

## Code: List

In [1]:
def compress_string(string):
    if string is None or len(string) == 0:
        return string
    
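    # Calculate the size of the compressed string: each run
    # contributes one character plus the digits of its count
    size = 0
    count = 0
    last_char = string[0]
    for char in string:
        if char == last_char:
            count += 1
        else:
            size += 1 + len(str(count))
            count = 1
            last_char = char
    size += 1 + len(str(count))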
    
    # If the compressed string size is greater than 
    # or equal to string size, return string
    if size >= len(string):
        return string

    # Create compressed_string
    compressed_string = list()
    count = 0
    last_char = string[0]
    for char in string:
        if char == last_char:
            count += 1
        else:
            compressed_string.append(last_char)
            compressed_string.append(str(count))
            count = 1
            last_char = char
    compressed_string.append(last_char)
    compressed_string.append(str(count))
    return "".join(compressed_string)

## Algorithm: Byte Array

The byte array algorithm is similar to the list version, except that we work with character codes (integers) instead of the characters themselves, since a bytearray stores integer byte values.

Complexity:
* Time: O(n)
* Space: O(m) where m is the size of the compressed bytearray
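
A short sketch of what "character codes" means here, assuming Python 3 (where indexing or iterating a bytearray yields integers). The variable names are illustrative only.

```python
data = bytearray(b'AAB')
print(list(data))            # [65, 65, 66]
print(data[0] == ord('A'))   # True

buf = bytearray(2)           # preallocated: two zero bytes
buf[0] = ord('A')            # store the code for 'A'
buf[1] = ord(str(3))         # store the code for the digit '3'
print(buf.decode('ascii'))   # A3
```

Note that `ord(str(count))` only works while the count is a single digit; longer counts need one byte per digit.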

## Code: Byte Array

In [2]:
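def compress_string_alt(string):
    if string is None or len(string) == 0:
        return string

    # Calculate the size of the compressed string: each run
    # contributes one character plus the digits of its count
    size = 0
    count = 0
    last_char_code = ord(string[0])
    for char_code in map(ord, string):
        if char_code == last_char_code:
            count += 1
        else:
            size += 1 + len(str(count))
            count = 1
            last_char_code = char_code
    size += 1 + len(str(count))

    # If the compressed string size is greater than
    # or equal to string size, return string
    if size >= len(string):
        return string

    # Create compressed_string, writing character codes (ints)
    compressed_string = bytearray(size)
    pos = 0
    count = 0
    last_char_code = ord(string[0])
    for char_code in map(ord, string):
        if char_code == last_char_code:
            count += 1
        else:
            # Counts of 10 or more need one byte per digit
            digits = str(count).encode('ascii')
            compressed_string[pos] = last_char_code
            compressed_string[pos+1:pos+1+len(digits)] = digits
            pos += 1 + len(digits)
            count = 1
            last_char_code = char_code
    digits = str(count).encode('ascii')
    compressed_string[pos] = last_char_code
    compressed_string[pos+1:pos+1+len(digits)] = digits
    # Decode so the result compares equal to a str in Python 3
    return compressed_string.decode('ascii')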

## Unit Test

*It is important to identify and run through general and edge cases from the [Test Cases](#Test-Cases) section by hand.  You generally will not be asked to write a unit test like the one shown below.*

In [3]:
from nose.tools import assert_equal

class Test(object):
    def test_compress(self, func):
        assert_equal(func(None), None)
        assert_equal(func(''), '')
        assert_equal(func('ABC'), 'ABC')
        assert_equal(func('AAABCCDDDD'), 'A3B1C2D4')
        print('Success: test_compress')

if __name__ == '__main__':
    test = Test()
    test.test_compress(compress_string)
    test.test_compress(compress_string_alt)

Success: test_compress
Success: test_compress
