There are simple and not-so-simple ways to do this, depending on your requirements. Some things to consider: I don't think you need facial recognition, as you mentioned earlier, unless you want to respond to faces you have seen before. Face detection and tracking typically require training, but I think they would only be necessary if you have to track faces from a camera mounted higher up. In that case, face detection would let you pick out TOTs that are facing your display. If the camera is mounted closer to eye level, you would not even need face detection or tracking; you could simply detect movement. For example, point the eyes at the top of the biggest moving object within the field of view. The advantage would be much simpler software and no training.
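To make the movement-only idea concrete, here is a rough sketch of one way it could work: compare two consecutive grayscale frames, find the largest connected patch of changed pixels, and aim at the top centre of that patch. Everything in it is my own assumption for illustration, not something from your setup: the function names (`motion_mask`, `largest_moving_blob`, `gaze_target`), the frames as plain lists of 0-255 brightness values, and the `threshold` and `min_pixels` defaults.

```python
from collections import deque

def motion_mask(prev, curr, threshold=25):
    """Mark pixels whose brightness changed by more than `threshold` (assumed value)."""
    return [[abs(c - p) > threshold for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

def largest_moving_blob(mask):
    """Return the (row, col) pixels of the biggest 4-connected region of motion."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited moving pixel.
            blob = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) > len(best):
                best = blob
    return best

def gaze_target(prev, curr, threshold=25, min_pixels=4):
    """Return (x, y) of the top centre of the biggest moving object, or None."""
    blob = largest_moving_blob(motion_mask(prev, curr, threshold))
    if len(blob) < min_pixels:  # ignore tiny blobs, which are probably sensor noise
        return None
    top = min(y for y, _ in blob)
    left = min(x for _, x in blob)
    right = max(x for _, x in blob)
    return ((left + right) // 2, top)

if __name__ == "__main__":
    # Tiny synthetic example: an empty scene, then a 2x3 "object" plus one noisy pixel.
    prev = [[0] * 8 for _ in range(6)]
    curr = [row[:] for row in prev]
    for y in (2, 3):
        for x in (3, 4, 5):
            curr[y][x] = 200
    curr[0][0] = 200
    print(gaze_target(prev, curr))  # (4, 2)
```

In practice the frames would come from the camera, and the returned `(x, y)` would be mapped to whatever moves the eyes. A vision library such as OpenCV could replace the hand-written differencing and blob search, but the logic would stay about this simple.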