computer vision

All posts tagged computer vision

As part of a project, I’m trying to learn how to do motion capture on videos. Fortunately, there’s Python support for the OpenCV computer vision library.

I adapted some motion capture code I found online that uses the Shi-Tomasi Corner Detector scheme to find good features to track — regions in a grayscale video frame that have large derivatives in two orthogonal directions.

Then the code estimates the optical flow using the Lucas-Kanade method, which applies a least-squares fit to solve for the two-dimensional velocity vector of the corner features.

As a test case, I used a video of Alice singing the “Ito Maki Maki” song.

The shiny tracks in the video show the best-fit model. Interestingly, the corner detection scheme chooses to follow the glints in her eyes and on her lip. The motion tracker does a good job following the glints until she blinks and swings her arm across her face.

The code I used is posted below.

import numpy as np
import cv2
cap = cv2.VideoCapture('IMG_0986.mov')
size = (int(cap.get(cv2.cv.CV_CAP_PROP_FRAME_WIDTH)),
int(cap.get(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT)))

# params for ShiTomasi corner detection
feature_params = dict( maxCorners = 100,
qualityLevel = 0.3,
minDistance = 7,
blockSize = 7 )

# Parameters for lucas kanade optical flow
lk_params = dict( winSize = (15,15),
maxLevel = 2,
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

# Create some random colors
color = np.random.randint(0,255,(100,3))

# Take first frame and find corners in it
ret, frame = cap.read()
old_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
p0 = cv2.goodFeaturesToTrack(old_gray, mask = None, **feature_params)

# Create a mask image for drawing purposes
mask = np.zeros_like(frame)

images = list()

height , width , layers = frame.shape

fourcc = cv2.cv.CV_FOURCC('m', 'p', '4', 'v')
video = cv2.VideoWriter()
success = video.open('Alice_singing.mp4v', fourcc, 15.0, size, True)

ret = True
while(ret):
  print(ret)

  frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# calculate optical flow
  p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)

# Select good points
  good_new = p1[st==1]
  good_old = p0[st==1]

# draw the tracks
  for i,(new,old) in enumerate(zip(good_new,good_old)):
    a,b = new.ravel()
    c,d = old.ravel()
    cv2.line(mask, (a,b),(c,d), color[i].tolist(), 2)
    cv2.circle(frame,(a,b),5,color[i].tolist(),-1)

  img = cv2.add(frame,mask)

  video.write(img)
  ret,frame = cap.read()

# Now update the previous frame and previous points
  old_gray = frame_gray.copy()
  p0 = good_new.reshape(-1,1,2)

cap.release()
video.release()
cv2.destroyAllWindows()