Otary library, shape your images, image your shapes.
Welcome to Otary
Otary is an elegant, readable, and powerful Python library for image and 2D geometry manipulation.
Features
The main features of Otary are:
- Unification: Otary offers a cohesive solution for image and geometry manipulation, letting you work seamlessly without switching tools.
- Readability: Self-explanatory by design. Otary’s clean, readable code eliminates the need for comments, making it easy for beginners to learn and for experts to build efficiently.
- Performance: Optimized for speed and efficiency, making it suitable for high-performance applications. It is built on top of NumPy and OpenCV, both known for their speed.
- Interactivity: Designed to be interactive and user-friendly, ideal for Jupyter notebooks and live exploration.
- Flexibility: Provides a flexible and extensible architecture, allowing developers to customize and extend its functionality as needed.
Example
Let me illustrate the usage of Otary with a simple example. Imagine you need to:
- read an image from a PDF file
- draw an ellipse on it
- crop a part of the image
- rotate the cropped image
- apply a threshold
- show the image
To compare Otary with other libraries, I will implement the same example with each. Try it yourself with your favorite LLM (such as ChatGPT) by copying this query:
Generate a python code to read an image from a pdf, draw an ellipse on it, crop a part of the image, rotate the cropped image, apply a threshold on the image.
Using Otary, you can do it in a few lines of code:
import otary as ot

im = ot.Image.from_pdf("path/to/your/file.pdf", page_nb=0)
ellipse = ot.Ellipse(foci1=[100, 100], foci2=[400, 400], semi_major_axis=250)

im = (
    im.draw_ellipses([ellipse])
    .crop(x0=50, y0=50, x1=450, y1=450)
    .rotate(angle=90, is_degree=True)
    .threshold_simple(thresh=200)
)

im.show()
"""
Providing the input to ChatGPT gives the following code
"""
import fitz  # PyMuPDF
import numpy as np
import cv2


def read_image_from_pdf(pdf_path, page_number=0, dpi=300):
    """Extracts the specified page as an image from a PDF."""
    doc = fitz.open(pdf_path)
    page = doc[page_number]
    mat = fitz.Matrix(dpi / 72, dpi / 72)  # scale to DPI
    pix = page.get_pixmap(matrix=mat)
    img = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.height, pix.width, pix.n)
    if img.shape[2] == 4:
        img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
    return img


def draw_ellipse(img, center, axes, angle=0, color=(0, 255, 0), thickness=2):
    """Draws an ellipse on the image."""
    return cv2.ellipse(img.copy(), center, axes, angle, 0, 360, color, thickness)


def crop_image(img, top_left, bottom_right):
    """Crops the image using top-left and bottom-right coordinates."""
    x1, y1 = top_left
    x2, y2 = bottom_right
    return img[y1:y2, x1:x2]


def rotate_image(img, angle):
    """Rotates the image around its center by the given angle."""
    (h, w) = img.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE)
    return rotated


def apply_threshold(img, thresh_value=127):
    """Applies a binary threshold on the grayscale version of the image."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, thresh_value, 255, cv2.THRESH_BINARY)
    return thresh


def main():
    pdf_path = "your_file.pdf"

    # Step 1: Read image from PDF
    img = read_image_from_pdf(pdf_path)

    # Step 2: Draw an ellipse on the image
    h, w = img.shape[:2]
    center = (w // 2, h // 2)
    axes = (w // 4, h // 6)
    img_with_ellipse = draw_ellipse(img, center, axes, angle=30, color=(0, 0, 255), thickness=3)

    # Step 3: Crop a part of the image
    cropped_img = crop_image(img_with_ellipse, (100, 100), (500, 500))

    # Step 4: Rotate the cropped image
    rotated_img = rotate_image(cropped_img, angle=45)

    # Step 5: Apply threshold
    thresholded_img = apply_threshold(rotated_img, thresh_value=150)

    # Display results
    cv2.imshow("Ellipse Image", img_with_ellipse)
    cv2.imshow("Cropped Image", cropped_img)
    cv2.imshow("Rotated Image", rotated_img)
    cv2.imshow("Thresholded Image", thresholded_img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    # Optionally save results
    cv2.imwrite("ellipse_image.jpg", img_with_ellipse)
    cv2.imwrite("cropped_image.jpg", cropped_img)
    cv2.imwrite("rotated_image.jpg", rotated_img)
    cv2.imwrite("thresholded_image.jpg", thresholded_img)


if __name__ == "__main__":
    main()
ChatGPT proposes to re-invent the wheel.
Using Otary makes the code:
- Much more readable and hence maintainable
- Much more interactive
- Much simpler, with a single library to import and manage instead of juggling several such as Pillow, OpenCV, scikit-image, and PyMuPDF
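
One concrete difference between the two listings above is how the ellipse is described: the Otary example passes two foci and a semi-major axis, while `cv2.ellipse` expects a center, half-axis lengths, and a rotation angle. The following is a minimal sketch of that conversion using only the standard library. The function name `foci_to_center_axes_angle` is hypothetical (it is not part of Otary or OpenCV), and the sketch assumes `semi_major_axis` is the usual semi-major axis length a, so that a² = b² + c² with c the distance from the center to a focus.

```python
import math


def foci_to_center_axes_angle(foci1, foci2, semi_major_axis):
    """Convert a foci-based ellipse description to (center, half-axes, angle).

    Hypothetical helper for illustration only; not part of Otary or OpenCV.
    """
    (x1, y1), (x2, y2) = foci1, foci2

    # The center is the midpoint of the two foci.
    center = ((x1 + x2) / 2, (y1 + y2) / 2)

    # c is the distance from the center to either focus.
    c = math.hypot(x2 - x1, y2 - y1) / 2
    if semi_major_axis <= c:
        raise ValueError("semi_major_axis must exceed half the distance between the foci")

    # Ellipse identity: a^2 = b^2 + c^2, so b = sqrt(a^2 - c^2).
    semi_minor_axis = math.sqrt(semi_major_axis**2 - c**2)

    # Angle of the major axis, measured from the x-axis in image coordinates.
    angle_deg = math.degrees(math.atan2(y2 - y1, x2 - x1))

    return center, (semi_major_axis, semi_minor_axis), angle_deg


# Same values as the Otary example above.
center, axes, angle = foci_to_center_axes_angle([100, 100], [400, 400], 250)
print(center, axes, angle)
```

The results are floats, so they would need rounding to integers before being passed to `cv2.ellipse`. With the foci-based form, this bookkeeping stays out of user code.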