**All the writing in this blog that is BLUE is my notes (besides some links)**.

Enumerate "Data" Big Idea from College Board

Some of the big ideas and vocab that you observe, talk about it with a partner ...

  • "Data compression is the reduction of the number of bits needed to represent data"
  • "Data compression is used to save transmission time and storage space."
  • "lossy data can reduce data but the original data is not recovered"
  • "lossless data lets you restore and recover"
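As a minimal sketch of the lossless idea, Python's built-in zlib module can shrink repetitive data and then restore it exactly. The sample text is made up for illustration, and zlib is only one of many lossless methods.

```python
import zlib

# Hypothetical sample data: repetitive text compresses well
original = b"blue sky " * 100

compressed = zlib.compress(original)    # fewer bytes to store or transmit
restored = zlib.decompress(compressed)  # lossless: the original comes back exactly

print(len(original), "bytes before,", len(compressed), "bytes after")
print("Restored matches original:", restored == original)
```

A lossy method would give a smaller result but could not pass the final check, because some of the original data would be gone.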

The Image Lab Project contains a plethora of College Board Unit 2 data concepts. Working with Images provides many opportunities for compression and analyzing size.

Image Files and Size

Describe some of the meta data and considerations when managing Image files. Describe how these relate to Data Compression ...

- File Type, PNG and JPG are two types used in this lab: The file type determines how the image data is stored and compressed. PNG uses lossless compression and JPG uses lossy compression, and the code that opens the image must support the file type.

- Size, height and width, number of pixels: The more pixels an image has, the more storage it takes up. The larger the width and height, the more data there is to store and transmit.

- Visual perception: Images have attributes, such as color mode and resolution, that change how the image is displayed and perceived. Lossy compression relies on the eye not noticing small losses in detail.

- lossy compression: Lossy compression produces a smaller file that needs less storage space. It does this by removing part of the image data, which lowers the quality, and the removed data cannot be recovered.
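To see why size matters, the uncompressed size of an image can be estimated as width × height × channels. This is a rough sketch that assumes 1 byte per channel and ignores file headers; `raw_size_bytes` is a name made up for this example.

```python
def raw_size_bytes(width, height, channels=3):
    """Bytes needed to store every pixel uncompressed (1 byte per channel)."""
    return width * height * channels

# Sizes taken from the Lassen Volcano image in this lab (RGB, 3 channels)
original = raw_size_bytes(2792, 2094)
scaled = raw_size_bytes(320, 240)

print("Original:", original, "bytes")
print("Scaled:  ", scaled, "bytes")
print("Scaled is about", round(original / scaled), "times smaller")
```

The file on disk is usually much smaller than this estimate because PNG and JPG both compress the pixel data.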

Displaying images in Python Jupyter notebook

Python Libraries and Concepts used for Jupyter and Files/Directories

IPython

IPython supports visualization of data in Jupyter notebooks. Visualization is specific to the view; for the web, the visualization needs to be converted to HTML.

pathlib

File paths are different on Windows versus Mac and Linux. This can cause problems in a project as you work and deploy on different Operating Systems (OS's). pathlib is a solution to this problem.
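A small sketch of how pathlib handles this, using a file name from this lab. `PurePosixPath` and `PureWindowsPath` are used here only to show how each OS family would write the same path; normal code would just use `Path`.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# The / operator joins path parts using the right separator for the OS
image_file = Path("images") / "clouds-impression.png"
print(image_file)         # separator depends on the OS running the code
print(image_file.suffix)  # file extension

# Pure paths show how each OS family would write the same path
print(PurePosixPath("images") / "clouds-impression.png")
print(PureWindowsPath("images") / "clouds-impression.png")
```

Because the code never types a slash or backslash inside a string, the same line works on Windows, Mac, and Linux.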

  • What are commands you use in terminal to access files?

- cd
- dir
- mkdir
- del
- ren
- copy
- move
- edit

  • What are the commands you use in Windows terminal to access files?

The commands listed above are the ones used in the Windows terminal. I use a Windows computer for running my terminal.

  • What are some of the major differences?

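There are no major differences in my case because both lists come from the same system: my computer is run by Windows, and so is the terminal inside of it. On Mac and Linux the commands differ, for example ls, rm, mv, and cp instead of dir, del, move, and copy.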

Provide what you observed, struggled with, or learned while playing with this code.

  • Why is path a big deal when working with images?

The path is a big deal when working with images because the code uses it to locate and load the image file. If the path is incorrect, the code cannot find the file and fails with an error instead of displaying the image.
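One way to catch a bad path early is to check that each file exists before trying to open it. This is only a sketch; `missing_files` is a helper name made up for this example, and the result depends on what is in your own images folder.

```python
from pathlib import Path

def missing_files(path, files):
    """Return the file names that do not exist under path."""
    return [name for name in files if not (Path(path) / name).is_file()]

# File names from this lab; the result depends on what is in your images folder
missing = missing_files("images", ["clouds-impression.png", "lassen-volcano.jpg"])
if missing:
    print("Missing images:", missing)
else:
    print("All images found")
```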

  • How does the meta data source and label relate to Unit 5 topics?

Meta data source and label relate to Unit 5 topics because both involve the use and management of data. The topics in Unit 5 cover the uses of data and how data itself can be used. Meta data adds an extra layer to that because meta data is data about data: the source and label describe where an image came from and what it shows. This places meta data under the Unit 5 topics and implies that the considerations that apply to data also apply to meta data.

  • Look up IPython, describe why this is interesting in Jupyter Notebooks for both Pandas and Images?

IPython is a system that makes the use of Pandas and Images easier. In Jupyter Notebooks, it provides easy ways to change data, to display data and images directly in the notebook, and to document code cells while the notebook is being created. To summarize, IPython makes Jupyter Notebooks, and the Pandas data and images inside of them, easier to work with.

from IPython.display import Image, display
from pathlib import Path  # https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f

# prepares a series of images
def image_data(path=Path("images/"), images=None):  # path of static images is defaulted
    if images is None:  # default image
        images = [
            {'source': "Peter Carolin", 'label': "Clouds Impression", 'file': "clouds-impression.png"},
            {'source': "Peter Carolin", 'label': "Lassen Volcano", 'file': "lassen-volcano.jpg"}
        ]
    for image in images:
        # File to open
        image['filename'] = path / image['file']  # file with path
    return images

def image_display(images):
    for image in images:  
        display(Image(filename=image['filename']))


# Run this as standalone tester to see sample data printed in Jupyter terminal
if __name__ == "__main__":
    # print parameter supplied image
    green_square = image_data(images=[{'source': "Internet", 'label': "Green Square", 'file': "green-square-16.png"}])
    image_display(green_square)
    
    # display default images from image_data()
    default_images = image_data()
    image_display(default_images)
    

Reading and Encoding Images (2 implementations follow)

PIL (Python Imaging Library)

Pillow or PIL provides the ability to work with images in Python. Geeks for Geeks shows some ideas on working with images.

base64

Image formats (JPG, PNG) are binary file formats, and binary data is difficult to embed directly in text. Base64 solves this by converting binary data (8-bit bytes) into text: every 3 bytes (24 bits) are regrouped into four 6-bit Base64 digits, each written as one printable ASCII character. Thus base64 is used to transport and embed binary images into textual assets such as HTML and CSS.

  • How is Base64 similar or different to Binary and Hexadecimal?

Base64 is similar to Binary and Hexadecimal because all three write the same underlying bits as digits in a number base. They differ in the base: Binary is base 2 (1 bit per digit), Hexadecimal is base 16 (4 bits per digit), and Base64 is base 64 (6 bits per digit).

  • Translate first 3 letters of your name to Base64.

Luc --> THVj
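The translation can be checked with Python's built-in base64 module. This sketch also prints the same three letters in binary and hexadecimal so the three representations can be compared side by side.

```python
import base64

text = "Luc"
raw = text.encode("ascii")  # 3 bytes, one per letter

print("Binary:", " ".join(format(b, "08b") for b in raw))
print("Hex:   ", raw.hex())
encoded = base64.b64encode(raw).decode("ascii")
print("Base64:", encoded)

# Decoding reverses the process, so Base64 is lossless
decoded = base64.b64decode(encoded).decode("ascii")
print("Decoded:", decoded)
```

Three letters are exactly 24 bits, so they map to four Base64 characters with no = padding.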

numpy

Numpy is described as "The fundamental package for scientific computing with Python". In the Image Lab, a Numpy array is created from the image data in order to simplify access and change to the RGB values of the pixels, converting pixels to grey scale.

io, BytesIO

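Input and Output (I/O) is a fundamental of all Computer Programming. I/O buffering is a technique used to optimize I/O operations: data is held temporarily in a region of memory (the buffer) while it moves from one place to another, so a slow source or destination does not stall the rest of the program. In this lab, BytesIO is an in-memory buffer that receives the saved image bytes before they are encoded as Base64. A very large picture takes longer to move through the buffer, which shows up as lag.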
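A minimal sketch of an in-memory buffer, using made-up bytes in place of a real image so it runs without any image files:

```python
from io import BytesIO

# Hypothetical stand-in for image bytes
data = b"\x89PNG fake image bytes"

with BytesIO() as buffer:
    buffer.write(data)            # data goes into memory instead of a file on disk
    contents = buffer.getvalue()  # read everything held in the buffer

print(len(contents), "bytes held in the buffer")
print("Same as what was written:", contents == data)
```

The lab code uses the same pattern, except that PIL's `img.save(buffer, format)` does the writing.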

  • Where have you been a consumer of buffering?

When loading new tabs or functions on websites, when downloading files, and when uploading files.

  • From your consumer experience, what effects have you experienced from buffering?

I lost time and/or lost interest in whatever I was doing.

  • How do these effects apply to images?

I could also lose time waiting to look at images, or lose interest in an image if it is not loading.

Data Structures, Imperative Programming Style, and working with Images

Introduction to creating meta data and manipulating images. Look at each procedure and explain the purpose and results of this program. Add any insights or challenges as you explored this program.

  • Does this code seem like a series of steps are being performed?

Yes, the code goes through different steps before it results in the images turning grey.

  • Describe Grey Scale algorithm in English or Pseudo code?

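The algorithm opens each image and scales it. It then reads the pixel data into an array and, for each pixel, averages the red, green, and blue values and uses that average for all three channels, keeping the alpha value if there is one. Finally, it converts each image to base64 so it can be embedded in HTML and displays the output, scaled and scaled/grey.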
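The grey scale step on its own can be sketched without any image library by working on a plain list of pixel tuples. `to_grey` and the sample pixels are made up for this example; the averaging follows the same integer-division approach as the lab code.

```python
def to_grey(pixels):
    """Replace each pixel's red, green, and blue with their integer average."""
    grey = []
    for pixel in pixels:
        average = (pixel[0] + pixel[1] + pixel[2]) // 3
        if len(pixel) > 3:
            grey.append((average, average, average, pixel[3]))  # keep alpha
        else:
            grey.append((average, average, average))
    return grey

# Hypothetical pixels: pure red (RGB) and a semi-transparent colour (RGBA)
print(to_grey([(255, 0, 0), (10, 20, 30, 128)]))
```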

  • Describe scale image? What is before and after on pixels in three images?

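Each image is scaled to a width of 320 pixels, with its height scaled by the same ratio so the proportions stay the same. Before and after on the images: Green Square goes from (16, 16) up to (320, 320), Clouds Impression stays at (320, 234), Lassen Volcano goes from (2792, 2094) down to (320, 240), and Smile face goes from (223, 226) up to (320, 324).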
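The size calculation can be tried on its own, without opening any image. This sketch repeats the arithmetic from `scale_image` in a helper named `scaled_size`, a name made up for this example.

```python
def scaled_size(size, base_width=320):
    """Return (width, height) after scaling to base_width, keeping the ratio."""
    scale_percent = base_width / float(size[0])
    scale_height = int(float(size[1]) * scale_percent)
    return (base_width, scale_height)

# Original sizes taken from the meta data output of this lab
for size in [(16, 16), (320, 234), (2792, 2094), (223, 226)]:
    print(size, "->", scaled_size(size))
```

Images narrower than 320 pixels are scaled up rather than down, so scaling only reduces the data for the larger images.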

  • Is scale image a type of compression? If so, line it up with College Board terms described?

Scaling an image down is a type of compression because it reduces the number of pixels in the image, which reduces the number of bits needed to represent it. It is lossy compression because the removed pixels cannot be recovered from the scaled image.
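A simplified sketch of why scaling down is lossy, using one made-up row of brightness values. Real image resizing uses resampling filters rather than simply dropping values, so this only illustrates the idea.

```python
# Hypothetical row of pixel brightness values
row = [10, 20, 30, 40, 50, 60]

# Scale down: keep every second value (half the data)
scaled_down = row[::2]

# Scale back up: repeat each value to return to the original length
scaled_up = [value for value in scaled_down for _ in range(2)]

print("Original:   ", row)
print("Scaled down:", scaled_down)
print("Scaled up:  ", scaled_up)
print("Original recovered:", scaled_up == row)
```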

from IPython.display import HTML, display
from pathlib import Path  # https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f
from PIL import Image as pilImage # as pilImage is used to avoid conflicts
from io import BytesIO
import base64
import numpy as np

# prepares a series of images
def image_data(path=Path("images/"), images=None):  # path of static images is defaulted
    if images is None:  # default image
        images = [
            {'source': "Internet", 'label': "Green Square", 'file': "green-square-16.png"},
            {'source': "Peter Carolin", 'label': "Clouds Impression", 'file': "clouds-impression.png"},
            {'source': "Peter Carolin", 'label': "Lassen Volcano", 'file': "lassen-volcano.jpg"},
            {'source': "Internet", 'label': "Smile face", 'file': "smileface.jpg"}
        ]
    for image in images:
        # File to open
        image['filename'] = path / image['file']  # file with path
    return images

# Large image scaled to baseWidth of 320
def scale_image(img):
    baseWidth = 320
    scalePercent = (baseWidth/float(img.size[0]))
    scaleHeight = int((float(img.size[1])*float(scalePercent)))
    scale = (baseWidth, scaleHeight)
    return img.resize(scale)

# PIL image converted to base64
def image_to_base64(img, format):
    with BytesIO() as buffer:
        img.save(buffer, format)
        return base64.b64encode(buffer.getvalue()).decode()

# Set Properties of Image, Scale, and convert to Base64
def image_management(image):  # adds meta data and scaled HTML to the image dictionary
    # Image open return PIL image object
    img = pilImage.open(image['filename'])
    
    # Python Image Library operations
    image['format'] = img.format
    image['mode'] = img.mode
    image['size'] = img.size
    # Scale the Image
    img = scale_image(img)
    image['pil'] = img
    image['scaled_size'] = img.size
    # Scaled HTML
    image['html'] = '<img src="data:image/png;base64,%s">' % image_to_base64(image['pil'], image['format'])
    
# Create Grey Scale Base64 representation of Image
def image_management_add_html_grey(image):
    # Image open return PIL image object
    img = image['pil']
    format = image['format']
    
    img_data = img.getdata()  # Reference https://www.geeksforgeeks.org/python-pil-image-getdata/
    image['data'] = np.array(img_data) # PIL image to numpy array
    image['gray_data'] = [] # key/value for data converted to gray scale

    # 'data' is an array of RGB(A) pixels; each pixel is replaced by the average of its red, green, and blue values
    for pixel in image['data']:
        # create gray scale of image, ref: https://www.geeksforgeeks.org/convert-a-numpy-array-to-an-image/
        average = (pixel[0] + pixel[1] + pixel[2]) // 3  # average pixel values and use // for integer division
        if len(pixel) > 3:
            image['gray_data'].append((average, average, average, pixel[3])) # PNG format
        else:
            image['gray_data'].append((average, average, average))
        # end for loop for pixels
        
    img.putdata(image['gray_data'])
    image['html_grey'] = '<img src="data:image/png;base64,%s">' % image_to_base64(img, format)


# Jupyter Notebook Visualization of Images
if __name__ == "__main__":
    # Load the default list of image dictionaries
    images = image_data()
    
    # Display meta data, scaled view, and grey scale for each image
    for image in images:
        image_management(image)
        print("---- meta data -----")
        print(image['label'])
        print(image['source'])
        print(image['format'])
        print(image['mode'])
        print("Original size: ", image['size'])
        print("Scaled size: ", image['scaled_size'])
        
        print("-- original image --")
        display(HTML(image['html'])) 
        
        print("--- grey image ----")
        image_management_add_html_grey(image)
        display(HTML(image['html_grey'])) 
    print()
---- meta data -----
Green Square
Internet
PNG
RGBA
Original size:  (16, 16)
Scaled size:  (320, 320)
-- original image --
--- grey image ----
---- meta data -----
Clouds Impression
Peter Carolin
PNG
RGBA
Original size:  (320, 234)
Scaled size:  (320, 234)
-- original image --
--- grey image ----
---- meta data -----
Lassen Volcano
Peter Carolin
JPEG
RGB
Original size:  (2792, 2094)
Scaled size:  (320, 240)
-- original image --
--- grey image ----
---- meta data -----
Smile face
Internet
JPEG
RGB
Original size:  (223, 226)
Scaled size:  (320, 324)
-- original image --
--- grey image ----

Data Structures and OOP

Most data structures classes require Object Oriented Programming (OOP). Since this class is lined up with a College Course, OOP will be talked about often. Functionality in the remainder of this Blog is the same as the prior implementation. Highlight some of the key differences you see between imperative and OOP styles.

  • Read imperative and object-oriented programming on Wikipedia
  • Consider how data is organized in two examples, in relations to procedures
  • Look at Parameters in Imperative and Self in OOP
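A small side-by-side sketch of the two styles. `describe` and `ImageInfo` are names made up for this example and are much smaller than the code in this lab.

```python
# Imperative style: data lives in a dictionary and is passed to a function as a parameter
def describe(image):
    return image['label'] + " from " + image['source']

image = {'source': "Peter Carolin", 'label': "Clouds Impression"}
print(describe(image))

# OOP style: data lives inside the object and methods reach it through self
class ImageInfo:  # hypothetical class, much smaller than Image_Data
    def __init__(self, source, label):
        self._source = source
        self._label = label

    def describe(self):
        return self._label + " from " + self._source

info = ImageInfo(source="Peter Carolin", label="Clouds Impression")
print(info.describe())
```

Both versions print the same text. The difference is where the data lives: in a parameter handed to the function, or inside the object that owns the method.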

Additionally, review all the imports in these three demos. Create a definition of their purpose, specifically these ...

- PIL: PIL is referred to in this algorithm because it provides "Image", which is used to open the image files, read their format, mode, and size, and resize them.

- numpy: Numpy is used to make an array out of the image's pixel data so that the RGB values of each pixel are easy to access. In this case, the RGB values are averaged to produce the grey style.

- base64: base64 encodes the image bytes as text so the image can be embedded directly in an HTML img tag and displayed in the notebook.

from IPython.display import HTML, display
from pathlib import Path  # https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f
from PIL import Image as pilImage # as pilImage is used to avoid conflicts
from io import BytesIO
import base64
import numpy as np


class Image_Data:

    def __init__(self, source, label, file, path, baseWidth=320):
        self._source = source    # variables with self prefix become part of the object, 
        self._label = label
        self._file = file
        self._filename = path / file  # file with path
        self._baseWidth = baseWidth

        # Open image and scale to needs
        self._img = pilImage.open(self._filename)
        self._format = self._img.format
        self._mode = self._img.mode
        self._originalSize = self._img.size
        self.scale_image()
        self._html = self.image_to_html(self._img)
        self._html_grey = self.image_to_html_grey()


    @property
    def source(self):
        return self._source  
    
    @property
    def label(self):
        return self._label 
    
    @property
    def file(self):
        return self._file   
    
    @property
    def filename(self):
        return self._filename   
    
    @property
    def img(self):
        return self._img
             
    @property
    def format(self):
        return self._format
    
    @property
    def mode(self):
        return self._mode
    
    @property
    def originalSize(self):
        return self._originalSize
    
    @property
    def size(self):
        return self._img.size
    
    @property
    def html(self):
        return self._html
    
    @property
    def html_grey(self):
        return self._html_grey
        
    # Large image scaled to baseWidth of 320
    def scale_image(self):
        scalePercent = (self._baseWidth/float(self._img.size[0]))
        scaleHeight = int((float(self._img.size[1])*float(scalePercent)))
        scale = (self._baseWidth, scaleHeight)
        self._img = self._img.resize(scale)
    
    # PIL image converted to base64
    def image_to_html(self, img):
        with BytesIO() as buffer:
            img.save(buffer, self._format)
            return '<img src="data:image/png;base64,%s">' % base64.b64encode(buffer.getvalue()).decode()
            
    # Create Grey Scale Base64 representation of Image
    def image_to_html_grey(self):
        img_grey = self._img.copy()  # copy so the scaled colour image is left unchanged
        pixels = np.array(self._img.getdata()) # PIL image to numpy array
        
        grey_data = [] # list of pixels converted to gray scale
        # 'pixels' is an array of RGB(A) values; each pixel is replaced by the average of its red, green, and blue values
        for pixel in pixels:
            # create gray scale of image, ref: https://www.geeksforgeeks.org/convert-a-numpy-array-to-an-image/
            average = (pixel[0] + pixel[1] + pixel[2]) // 3  # average pixel values and use // for integer division
            if len(pixel) > 3:
                grey_data.append((average, average, average, pixel[3])) # PNG format
            else:
                grey_data.append((average, average, average))
            # end for loop for pixels
            
        img_grey.putdata(grey_data)
        return self.image_to_html(img_grey)

        
# prepares a series of images, provides expectation for required contents
def image_data(path=Path("images/"), images=None):  # path of static images is defaulted
    if images is None:  # default image
        images = [
            {'source': "Internet", 'label': "Green Square", 'file': "green-square-16.png"},
            {'source': "Peter Carolin", 'label': "Clouds Impression", 'file': "clouds-impression.png"},
            {'source': "Peter Carolin", 'label': "Lassen Volcano", 'file': "lassen-volcano.jpg"},
            {'source': "Internet", 'label': "Smile face", 'file': "smileface.jpg"}
        ]
    return path, images

# turns data into objects
def image_objects():        
    id_Objects = []
    path, images = image_data()
    for image in images:
        id_Objects.append(Image_Data(source=image['source'], 
                                  label=image['label'],
                                  file=image['file'],
                                  path=path,
                                  ))
    return id_Objects

# Jupyter Notebook Visualization of Images
if __name__ == "__main__":
    for ido in image_objects(): # ido is an Imaged Data Object
        
        print("---- meta data -----")
        print(ido.label)
        print(ido.source)
        print(ido.file)
        print(ido.format)
        print(ido.mode)
        print("Original size: ", ido.originalSize)
        print("Scaled size: ", ido.size)
        
        print("-- scaled image --")
        display(HTML(ido.html))
        
        print("--- grey image ---")
        display(HTML(ido.html_grey))
        
    print()
---- meta data -----
Green Square
Internet
green-square-16.png
PNG
RGBA
Original size:  (16, 16)
Scaled size:  (320, 320)
-- scaled image --
--- grey image ---
---- meta data -----
Clouds Impression
Peter Carolin
clouds-impression.png
PNG
RGBA
Original size:  (320, 234)
Scaled size:  (320, 234)
-- scaled image --
--- grey image ---
---- meta data -----
Lassen Volcano
Peter Carolin
lassen-volcano.jpg
JPEG
RGB
Original size:  (2792, 2094)
Scaled size:  (320, 240)
-- scaled image --
--- grey image ---
---- meta data -----
Smile face
Internet
smileface.jpg
JPEG
RGB
Original size:  (223, 226)
Scaled size:  (320, 324)
-- scaled image --
--- grey image ---

Hacks

  • Choose 2 images, one that will more likely result in lossy data compression and one that is more likely to result in lossless data compression. Explain.

The first image would result in lossy data compression because it has a lot of detail that could easily be simplified and reduced without the quality changing much. The second image would result in lossless data compression because the details in the graph shouldn't be reduced, as that might create a result displaying wrong numerical values.

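from IPython.display import HTML, display
from pathlib import Path
from PIL import Image as pilImage
from io import BytesIO
import base64
import numpy as np

# PIL image converted to base64 (same helper as in the earlier code)
def image_to_base64(img, format):
    with BytesIO() as buffer:
        img.save(buffer, format)
        return base64.b64encode(buffer.getvalue()).decode()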

def image_data(path=Path("images/"), images=None):
    if images is None:
        images = [{'source': "Internet", 'label': "Smile face", 'file': "smileface.jpg"}]
    for image in images:
        image['filename'] = path / image['file']
    return images

def image_management(image):       
    # Image open return PIL image object
    img = pilImage.open(image['filename'])
    # Python Image Library operations
    image['format'] = img.format
    image['mode'] = img.mode
    image['pil'] = img
    image['html'] = '<img src="data:image/png;base64,%s">' % image_to_base64(image['pil'], image['format'])

def image_management_add_html_red(image):
    img = image['pil'].copy()  # copy so each filter starts from the original pixels
    format = image['format']
    img_data = img.getdata()
    image['data'] = np.array(img_data)
    image['red_data'] = []
    for pixel in image['data']:
        red = pixel[0]  # red is channel 0
        if len(pixel) > 3:
            image['red_data'].append((red, 0, 0, pixel[3]))
        else:
            image['red_data'].append((red, 0, 0))
    img.putdata(image['red_data'])
    image['html_red'] = '<img src="data:image/png;base64,%s">' % image_to_base64(img, format)

def image_management_add_html_green(image):
    img = image['pil'].copy()  # copy so each filter starts from the original pixels
    format = image['format']
    img_data = img.getdata()
    image['data'] = np.array(img_data)
    image['green_data'] = []
    for pixel in image['data']:
        green = pixel[1]  # green is channel 1
        if len(pixel) > 3:
            image['green_data'].append((0, green, 0, pixel[3]))
        else:
            image['green_data'].append((0, green, 0))
    img.putdata(image['green_data'])
    image['html_green'] = '<img src="data:image/png;base64,%s">' % image_to_base64(img, format)

def image_management_add_html_blue(image):
    img = image['pil'].copy()  # copy so each filter starts from the original pixels
    format = image['format']
    img_data = img.getdata()
    image['data'] = np.array(img_data)
    image['blue_data'] = []
    for pixel in image['data']:
        blue = pixel[2]  # blue is channel 2
        if len(pixel) > 3:
            image['blue_data'].append((0, 0, blue, pixel[3]))
        else:
            image['blue_data'].append((0, 0, blue))
    img.putdata(image['blue_data'])
    image['html_blue'] = '<img src="data:image/png;base64,%s">' % image_to_base64(img, format)

if __name__ == "__main__":
    images = image_data()
    for image in images:
        image_management(image)
        print("-- original image --")
        display(HTML(image['html']))
        print("-- red image --")
        image_management_add_html_red(image)
        display(HTML(image['html_red']))
        print("--- green image ----")
        image_management_add_html_green(image)
        display(HTML(image['html_green']))
        print("--- blue image ----")
        image_management_add_html_blue(image)
        display(HTML(image['html_blue']))
-- original image --
-- red image --
--- green image ----
--- blue image ----
</div>