Vision-based control of a robot arm

My name is Hemraj, and I am a student at a university in Germany. I would like to ask about vision-based control of a robot arm.
We have a Universal Robots UR10. We need to control it using a stereo camera and carry out pick-and-place operations. How should we approach this task?
We are working on object detection with a depth camera (Intel RealSense D435i) using R-CNN, and the results are good. However, I do not understand how to control the robot arm from these detections, and I am not sure whether I need to work with ROS.
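To make the question more concrete, here is a minimal sketch of the step I believe comes between detection and motion: turning a detected pixel plus its depth into a 3D point in the robot base frame. It assumes a plain pinhole model with no lens distortion. The intrinsics (`FX`, `FY`, `CX`, `CY`), the camera-to-base transform `T_BASE_CAM`, and the function names are all hypothetical placeholders, not values from our setup; the real transform would have to come from a hand-eye calibration.

```python
import numpy as np


def deproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to a 3D point
    in the camera frame, using the pinhole model (no distortion)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])


def camera_to_base(point_cam, T_base_cam):
    """Map a 3D point from the camera frame to the robot base frame
    with a 4x4 homogeneous transform."""
    p = np.append(point_cam, 1.0)  # homogeneous coordinates
    return (T_base_cam @ p)[:3]


# Hypothetical intrinsics (pixels); real values come from the camera.
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

# Hypothetical extrinsics: camera mounted 1 m above the base,
# offset 0.5 m along x, looking straight down (180 deg about x).
T_BASE_CAM = np.array([
    [1.0,  0.0,  0.0, 0.5],
    [0.0, -1.0,  0.0, 0.0],
    [0.0,  0.0, -1.0, 1.0],
    [0.0,  0.0,  0.0, 1.0],
])

# Centre of a detected bounding box and the depth measured there.
point_cam = deproject_pixel(380, 240, 0.8, FX, FY, CX, CY)
point_base = camera_to_base(point_cam, T_BASE_CAM)
print(point_base)
```

The resulting point would presumably serve as the pick target for the arm. How to send such a target to the UR10, with ROS or without it, is the part I am unsure about.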
Any input would be helpful. Thank you in advance.