Real time Drone object tracking using Python and OpenCV


After flying this past weekend with Gabriel and Leandro, using Gabriel’s drone (a handmade APM 2.6 based quadcopter), in our town (Porto Alegre, Brazil), I decided to implement object tracking with OpenCV and Python and see how well a simple, fast method like Meanshift would do. The results were very impressive. I believe there is still plenty of room for optimization, but the algorithm already runs in real time in Python, with good results, on Full HD video (1920×1080) at 30 fps.

Here is the video of the flight that was piloted by Gabriel:

See it in Full HD for more details.

The algorithm is very simple and straightforward (fewer than 50 lines of Python). It can be described as follows:

  • A ROI (Region of Interest) is defined, in this case the building that I want to track
  • The normalized histogram and back-projection are calculated
  • The Meanshift algorithm is used to track the ROI
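
The three steps above can be sketched on toy data without OpenCV. This is only a pure-Python illustration of the idea, not how `cv2.calcHist`, `cv2.calcBackProject`, or `cv2.meanShift` are implemented: the function names, the 20×20 synthetic "hue images", and the absence of a saturation mask are all assumptions made for the example.

```python
import math


def hue_histogram(hues, bins=180):
    """Count hue values (0..bins-1) and scale so the largest bin is 255."""
    hist = [0] * bins
    for h in hues:
        hist[h] += 1
    peak = max(hist) or 1
    return [255.0 * v / peak for v in hist]


def back_project(hue_image, hist):
    """Replace each pixel's hue with the histogram value of that hue."""
    return [[hist[h] for h in row] for row in hue_image]


def mean_shift(weights, window, max_iter=80, eps=1):
    """Move window (x, y, w, h) onto the weighted centroid until it settles."""
    x, y, w, h = window
    rows, cols = len(weights), len(weights[0])
    for _ in range(max_iter):
        total = sum_x = sum_y = 0.0
        for j in range(y, min(y + h, rows)):
            for i in range(x, min(x + w, cols)):
                total += weights[j][i]
                sum_x += weights[j][i] * i
                sum_y += weights[j][i] * j
        if total == 0:
            break  # no matching pixels inside the window
        # Place the window so that its center sits on the centroid.
        new_x = int(math.floor(sum_x / total - (w - 1) / 2.0 + 0.5))
        new_y = int(math.floor(sum_y / total - (h - 1) / 2.0 + 0.5))
        new_x = max(0, min(new_x, cols - w))
        new_y = max(0, min(new_y, rows - h))
        moved = abs(new_x - x) + abs(new_y - y)
        x, y = new_x, new_y
        if moved < eps:
            break
    return (x, y, w, h)


def make_frame(size, top_left, side=4, target_hue=100, background_hue=10):
    """Synthetic hue image: a square target on a uniform background."""
    x0, y0 = top_left
    return [[target_hue if x0 <= i < x0 + side and y0 <= j < y0 + side
             else background_hue
             for i in range(size)] for j in range(size)]


# Step 1: the ROI is the 4x4 target at (5, 5) in the first frame.
first = make_frame(20, (5, 5))
roi_hues = [first[j][i] for j in range(5, 9) for i in range(5, 9)]

# Step 2: normalized histogram of the ROI, back-projected onto the next frame.
roi_hist = hue_histogram(roi_hues)
second = make_frame(20, (7, 6))  # the target has moved
weights = back_project(second, roi_hist)

# Step 3: mean shift drags the old window onto the target's new position.
tracked = mean_shift(weights, (5, 5, 4, 4))
print(tracked)
```

On this toy data the window moves from (5, 5) to (7, 6), the target's new top-left corner.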

The entire tracking code is shown below:

import numpy as np
import cv2

def run_main():
    cap = cv2.VideoCapture('upabove.mp4')

    # Read the first frame of the video
    ret, frame = cap.read()
    if not ret:
        # frame is None when the video cannot be opened or decoded
        raise IOError("Could not read a frame from 'upabove.mp4'")

    # Set the ROI (Region of Interest). Actually, this is a
    # rectangle of the building that we're tracking
    c,r,w,h = 900,650,70,70
    track_window = (c,r,w,h)

    # Create mask and normalized histogram
    roi = frame[r:r+h, c:c+w]
    hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    # The mask drops pixels with low saturation or low brightness
    mask = cv2.inRange(hsv_roi, np.array((0., 30.,32.)), np.array((180.,255.,255.)))
    roi_hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])
    cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
    # Stop after 80 iterations or when the window moves by less than 1 pixel
    term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 80, 1)

    while True:
        ret, frame = cap.read()
        if not ret:
            break  # end of the video

        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        dst = cv2.calcBackProject([hsv], [0], roi_hist, [0,180], 1)

        ret, track_window = cv2.meanShift(dst, track_window, term_crit)

        x,y,w,h = track_window
        cv2.rectangle(frame, (x,y), (x+w,y+h), 255, 2)
        # cv2.LINE_AA replaced cv2.CV_AA in OpenCV 3
        cv2.putText(frame, 'Tracked', (x-25,y-10), cv2.FONT_HERSHEY_SIMPLEX,
            1, (255,255,255), 2, cv2.LINE_AA)
        cv2.imshow('Tracking', frame)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    run_main()
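
The track window in the listing above is hard-coded for a 1920×1080 frame. For video at another resolution, one possible approach is to rescale the window and clamp it to the frame so that the ROI slice is never empty. The helper below is a hypothetical sketch (the name `scale_window` and its signature are not part of the original code or of OpenCV):

```python
def scale_window(window, src_size, dst_size):
    """Rescale an (x, y, w, h) window from src_size to dst_size.

    Both sizes are (width, height) tuples. The result is clamped so that
    the window lies completely inside the destination frame.
    """
    x, y, w, h = window
    sx = dst_size[0] / float(src_size[0])
    sy = dst_size[1] / float(src_size[1])
    # Keep the window at least 1 pixel wide and high, and no larger than the frame.
    w = max(1, min(int(round(w * sx)), dst_size[0]))
    h = max(1, min(int(round(h * sy)), dst_size[1]))
    # Keep the top-left corner far enough from the right and bottom edges.
    x = max(0, min(int(round(x * sx)), dst_size[0] - w))
    y = max(0, min(int(round(y * sy)), dst_size[1] - h))
    return (x, y, w, h)


# The window from the post, rescaled for a half-resolution (960x540) video.
half_res_window = scale_window((900, 650, 70, 70), (1920, 1080), (960, 540))
print(half_res_window)
```

The returned tuple could be used in place of the hard-coded `track_window`, with `c,r,w,h` unpacked from it before slicing the ROI.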

I hope you liked it!


  1. Will this be as accurate using the FPV video at standard PAL resolution? It would be incredibly more useful to be able to track and detect from a live feed and in post-processing after a flight.

  2. This inspires me. I’m joining a few guys who are engineering software for drones and I only have Java / backend experience. This makes me believe I can have some fun with them! 🙂 Thanks

    1. I don’t think the Raspberry Pi could do this frame rate. I am also really interested in knowing which hardware was used!

      1. Hello! Could you give me some tips for a school project?

        It consists of a drone (F450) with the APM 2.6 controller, which needs to fly through a frame (window) that can be of any geometric shape (this has to be done autonomously). I thought of using a camera and a Raspberry Pi to recognize the geometric shape and find its center, so that the aircraft can pass through the middle of it.

        Thanks for your attention 😉

    2. Hello Thomas, we’re using an APM 2.6 board for the drone, but the video was recorded with a GoPro and processed on a desktop computer. The video was processed in real time, though, so this could also be used for an FPV transmission, for instance. I also believe that the ARM on the Raspberry Pi has enough resources to process this, but I still have to test that.

        1. Did you document a tutorial? I am interested in this project and would appreciate some advice on how to start.

    1. I’ve run into this while installing OpenCV. Does anyone know the history/reasoning behind this change? I’m a neophyte to OpenCV.

      1. In OpenCV version 3.0, a lot of refactoring was done. Due to rising interest, contributions from people around the world, and GSoC, the library was getting a lot bigger. Version 3 seemed to be the right time to reduce the technical debt and provide a cleaner and easier design so that, as you said, neophytes can learn faster.

  3. Did you test how the Meanshift algorithm performs compared with Median-Flow or TLD in this tracking case?

    I tried some time ago and Median-Flow had better results in my case.

  5. Hi, your case is very simple, so practically any respectable tracking method will give you good results. But for your case (especially with a drone), KLT tracking for video stabilization followed by a mixed KLT/MS tracking will give you much better robustness and accuracy, even with deformable objects (and with OpenCV it is not very complex to code).

    I implemented something similar some time ago for my master’s thesis. If you are interested, you can see the result and download my thesis here:

      1. Take into account that for tracking you usually do not use the image at its original size when you work with corners or edges, because their core information remains almost intact in downsampled images. In the very few cases where you need precision at the original size, what you do is a pyramidal analysis.

        The problem with the usual application of MS is that, with only the color information, it fails a lot when you have similar histograms (which happens a lot in real scenarios). You can easily boost its performance just by adding the gradient information to the rest of the histogram (counting, for example, the number of edge points) and using the channels in a dependent way (using each RGBE sample as a single value in the histogram; this consumes a lot of memory, but it is worth the extra cost).

        If you later use a KLT method to stabilize the video, you can go even further and use the features tracked by KLT to learn the MS histogram instead of using a single patch (that will allow you to get much better results with deformable objects, and it will allow you to use much smarter learning methods).

  6. Nice job!
    Now I would love to see that running in real time on the drone. I guess a typical smartphone-type device would be enough (granted we can use its GPU to speed up the computation).
    Imagine what you could do with a modern smartphone’s computational power! (without even mentioning connectivity like cellular, 3G, GPS, its IMUs, etc.)

    Aaaannnd now I need a drone… :-/

  7. Really good job man! I was researching something slightly different. I wanted to get the GPS coordinates of the object I am tracking using my drone’s GPS info.
    I really want this for a promising project I have in mind. The accuracy is bothering me as well (maybe I would make the drone halt to get a solid GPS lock first).
    If it’s possible, or if you can demonstrate it, I would be more than thankful man!

  8. Hi Christian,

    your tracking software seems to be fast and robust.
    My team and I have created a sales platform for intelligent video analytics. We have a base of clients who would find your software highly interesting.

    The platform is called Blepo, and can be found at As a developer, you can advertise your software entirely for free, to a wide range of users. These users will be able to test your software against their own videos, allowing them to find software that is best suited to their needs.

    The platform is cloud based and secure. You may choose to upload a demo version of your IVA software, or host the demo version on your own servers. If hosted on your own servers, you will receive an online request every time a user would like to test it. This ensures complete protection of your software.

    I would be delighted to have you join the community – the exposure it will bring your software is significant. You can find more information on our community at

    Kindest regards,


  9. how can i fix this problem
    Traceback (most recent call last):
    File "C:/Python27/koko", line 45, in <module>
    File "C:/Python27/koko", line 16, in run_main
    roi = frame[r:r+h, c:c+w]
    TypeError: 'NoneType' object has no attribute '__getitem__'

      1. Which camera did you use? I was just getting started with your code and I’m also getting the same NoneType error.

  10. HI Chris,

    Can we also use your solution on a moving target? We are interested in it.

    1. Hello Elie, sure, you can use it on a moving target. However, there are other methods that could improve your tracking; this one was deliberately very simple (and performs well). It depends on how much performance you can get, the quality of the video, etc.

      1. Hi Christian, thanks for the reply. How can we implement this with our drone and gimbal? We are using the Pixhawk. What do we need to be able to do it?

  11. roi = frame[r:r+h, c:c+w]
    TypeError: 'NoneType' object has no attribute '__getitem__'
    how to solve

    1. As far as I know, the OpenCV license is a 3-clause BSD license ( and I’m citing the link from the documentation example in the post itself (also, if you see, the code isn’t actually exact the same, do you know another way of doing meanshift without using the same API ?), if this is violating the license, please let me know so I can remove the code in question.

  12. How did you give those numbers to the line where you defined the ROI? I mean how did you get those numbers?

    If you can tell me that it’d be great help for me…

  13. Hi Chris,

    Amazing video capture. I need a little help. How did you get the “upabove.mp4” into the program? Also, if I want to track circles on the ground, how would I change the parameters c,r,w,h?

  14. Hey, thank you so much for the code! Actually, my project deals with detecting faces from a drone using FPV (to detect wanted persons, for example). Can you please tell me if there’s code for detecting faces?

  15. Hi Chris,

    I need to implement a follow algorithm for a UAV to follow a Car. The UAV receives the GPS co-ordinates of the car. I would be implementing it in PX4 FMU. I also want to have a good quality video recording while following. How could I use your code?

  16. Hey would this field include Computer Science or Computer Engineering when discussing object tracking in real time? as well as connecting a drone to GPS or computers ? please send me helpful links on these topics if possible..

  17. I want to develop the same thing using a Raspberry Pi and ArduPilot. How can I achieve this? Can you suggest some useful tutorials, as I’m a newbie with drones?

  18. The tracker seems to work fast and reliably!
    I have one concern: here the drone is at a constant distance from the object. If we assume the drone is moving toward the object, does the bounding box keep track of the movement? In other words, will the bounding box enlarge as the object gets closer?


  19. Hi !

    I need your help. I want to know how we can process a drone’s live feed to detect a person. Is it possible or not? If yes, then how? Can we use any drone to get a live feed on our desktop and process it according to our needs?
    I’m waiting for your response.


  20. Hello Mr. Perrone,
    I was looking at your code and I am trying to run it with an A.R Drone 2.0, but nothing is streaming. Am I supposed to add something?

  21. Is there a downloadable executable, like an installable app? I am not a programmer, but I am a drone builder and this is very appealing.

  22. Set the ROI (Region of Interest). Actually, this is a
    # rectangle of the building that we’re tracking
    c,r,w,h = 900,650,70,70
    track_window = (c,r,w,h)

    can you please explain these lines

  23. Hi Christian

    This would be very useful for a Non Profit UAV based anti poaching project I am working on and it would be great if you would be willing to discuss this and the obstacles we are facing with auto camera and drone positioning.



  24. Hi, I tried to run it, but I get this error:

    mask = cv2.inRange(hsv_roi, np.array((0., 30.,32.)), np.array((180.,255.,255.)))

    error: /Users/travis/build/skvark/opencv-python/opencv/modules/core/src/arithm.cpp:1769: error: (-209) The lower boundary is neither an array of the same size and same type as src, nor a scalar in function inRange
