Chandima Fernando, Hailey Marcello, Jakub Wlodek, John Sinsheimer, Daniel Olds, Stuart I. Campbell and Phillip M. Maffettone*
National Synchrotron Light Source II, Brookhaven National Laboratory, Upton, NY 11973, USA. E-mail: pmaffetto@bnl.gov
First published on 20th March 2025
The integration of robotics and artificial intelligence (AI) into scientific workflows is transforming experimental research, particularly at large-scale user facilities such as the National Synchrotron Light Source II (NSLS-II). We present an extensible architecture for robotic sample management that combines the Robot Operating System 2 (ROS2) with the Bluesky experiment orchestration ecosystem. This approach enabled seamless integration of robotic systems into high-throughput experiments and adaptive workflows. Key innovations included a client-server model for managing robotic actions, real-time pose estimation using fiducial markers and computer vision, and closed-loop adaptive experimentation with agent-driven decision-making. Deployed using widely available hardware and open-source software, this architecture successfully automated a full shift (8 hours) of sample manipulation without errors. The system's flexibility and extensibility allow rapid re-deployment across different experimental environments, enabling scalable self-driving experiments for end stations at scientific user facilities. This work highlights the potential of robotics to enhance experimental throughput and reproducibility, providing a roadmap for future developments in automated scientific discovery where flexibility, extensibility, and adaptability are core requirements.
Deploying robotics inside or alongside the unit operations of scientific experiments—or ‘end stations’ in the parlance of user facilities—has been explored as a potential solution for increasing scientific throughput, efficiency, and safety.7–11,16 These deployments include robotic arms at the core of workcells,21,22 mobile robots as integrators between various unit operations,23 and active cooperation between cobots and human researchers.7,24 This latter direct cooperation is not possible in certain environments due to safety constraints, although robotic arms can integrate with other actuation for mobility within protected spaces.16 Many of the existing approaches depend on vendor-supplied software for robot orchestration6,7,17–19,23 and develop ad hoc tooling to combine robotics and existing equipment.1,22,25 We recently demonstrated the use of the Robot Operating System 2 (ROS2) for facile integration of a pick-and-place robot solution into an existing end station;16 however, there remain limitations in flexibility, adaptability to failure, and extensibility to new experiments.
With these advancements in automation, there is a growing motivation in the scientific community for autonomous, or self-driving, experiments that leverage artificial intelligence in their operational and scientific decision making.20 Robots can leverage AI agents in their internal planning, for example in path planning26 or environment recognition.4 At a higher level, these systems can also leverage agents for choosing their next experiment.27 As such, when integrating robots into experiment orchestration, it is crucial to ensure the availability of algorithmic integrations at varying levels of decision making.
An optimal arena for the development of scientific experiments driven by smart robotics is provided by large scientific user facilities, such as the NSLS-II. The NSLS-II is a light source that offers state-of-the-art capabilities for probing the structural and electronic properties of materials at atomic and microscopic scales.13 As next-generation light sources have continually increased the flux of their facilities, attention has turned toward experimental orchestration that can make the most effective use of these photon beams through advanced automation and AI techniques. Experiment end stations at NSLS-II (called beamlines) that offer routine techniques at scale represent a strategic evolution towards leveraging these advanced capabilities for high-throughput characterization tasks. Building on this concept, Beamline as a Service (BaaS) has emerged as a pioneering model, re-imagining how researchers interact with and utilize the facilities at a synchrotron.20 A BaaS framework envisions a network of self-driving beamlines, equipped with core information technologies, robotics, and agentic AI that can operate autonomously or in concert with human researchers. This potential ecosystem will maximize resource utilization and collaboration to catalyze advancements in energy and sustainability. Nonetheless, this will require substantive effort in technology development, and even more so in technology integration.
Given the complexity and diversity of scientific experiments conducted at NSLS-II, the Bluesky project‡ was developed as an open-source ecosystem offering unparalleled capabilities for orchestration, data management, and analysis.28 This project can be utilized piecemeal and has increasing adoption globally, including across all six U.S. Department of Energy light and neutron sources. At its core, the Bluesky RunEngine coordinates intricate experimental workflows, while complementary packages like Ophyd, Tiled, and Bluesky Adaptive enable seamless instrument integration, advanced data management, and adaptive experimentation, respectively.29 Its intuitive Python interface empowers researchers to design and execute experiments that were previously infeasible, including those which leverage robotics.16 Furthermore, the project's commitment to open-source development and community-driven innovation ensures that it remains at the forefront of scientific software. By bridging cutting-edge tools with accessible design, Bluesky not only enhances experimental efficiency but also opens new opportunities to deploy computational agents, drive adaptive science, and tackle the nonlinear challenges of discovery in materials science and beyond.
Considered in combination, these advancements in robotics for self-driving laboratories, the cutting-edge infrastructure of large scientific user facilities, and the development of experimental orchestration tooling have created substantial opportunities for accelerating scientific discovery, albeit with some unresolved limitations. Robotic deployments at user facilities must be reconfigurable, extensible, and robust to slight variations in environments. To achieve this, they must interface simply with existing experiment orchestration and other contemporary technologies (e.g., AI). These open challenges are largely related to the daunting task of integrating complex tools as opposed to the development of new tools.
Herein, we describe a generic and extensible architecture for the flexible deployment of robotic sample management at experimental end stations using Bluesky. We combined ROS2 control with Python abstractions that provide seamless integration of a robotic arm into Bluesky orchestration. We leveraged a development pipeline that includes simulation and test environments for flexibility and extensibility to new end stations. We then extended this application with computer vision and a sample database to ensure adaptability to failure and variable environments. Lastly, we closed the experimental loop by leveraging the tooling of Bluesky Adaptive for autonomous agent integrations in self-driving experiments. Our unique contributions in this work include the extension of a ROS-Bluesky interface to include real-time feedback and interruption capability, the integration of computer vision and sample management in the application, and the first demonstrated integration of Bluesky Adaptive with a robotic system for autonomous experiment execution. This solution has immediate implications at the many facilities deploying Bluesky, brings the community closer to the BaaS vision, and provides a road-map for other researchers or facilities seeking to leverage robotics to accelerate scientific research.
The Ophyd integration primarily depends on the use of Python's Future objects for asynchronous applications. In recent work, we demonstrated how the Future of a ROS2 Action could be exchanged for an Ophyd Status object to enable the RunEngine (akin to a ROS2 executor) to conduct an experiment that combines an existing end station and a new robot application (e.g., sample management).16 In the work described herein, we extended our previous developments to use Action feedback to create progress visualization of the completion of a robot action, and to enable the RunEngine to “interrupt” an ongoing experimental plan. In this case, the Actions were equipped with logic to pause, stop, abort, clean up, and rewind to a recent state. In our example of sample management, this empowers the user to change their mind while a robot is loading a selected sample: pause the plan, return the chosen sample if it has already been grasped, and then select a new sample.
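As an illustrative sketch of this bridge, the following shows how a ROS2 Action goal could be wrapped in an Ophyd Status object so that the RunEngine can wait on, or interrupt, the robot motion. This is a simplified example assuming the rclpy Action client and modern Ophyd Status APIs; the action type, action name, and `success` result field are hypothetical placeholders rather than our production interface.

```python
# Sketch: exchanging a ROS2 Action Future for an Ophyd Status object.
# The action type/name and the `success` result field are hypothetical.
from ophyd.status import Status
from rclpy.action import ActionClient


def trigger_robot_action(node, action_type, action_name, goal_msg):
    """Send a ROS2 Action goal and return an Ophyd Status that completes
    when the action result arrives (or fails)."""
    status = Status()
    client = ActionClient(node, action_type, action_name)

    def on_result(result_future):
        # Mark the Status finished (or failed) so the RunEngine can proceed.
        result = result_future.result().result
        if getattr(result, "success", True):
            status.set_finished()
        else:
            status.set_exception(RuntimeError("Robot action reported failure"))

    def on_goal_response(goal_future):
        goal_handle = goal_future.result()
        if not goal_handle.accepted:
            status.set_exception(RuntimeError("Goal rejected by the Action server"))
            return
        goal_handle.get_result_async().add_done_callback(on_result)

    client.send_goal_async(goal_msg).add_done_callback(on_goal_response)
    return status
```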
To leverage robots for self-driving experiments in Bluesky, we depended on the Bluesky Adaptive package. This is an actively developed and growing component of the Bluesky ecosystem, designed to provide a harness for adaptivity and intelligent decision making in experiment workflows. At its core, Bluesky Adaptive provides a flexible API that supports a spectrum of adaptive algorithms, ranging from simple rule-based approaches to complex AI-driven models. While it is primarily intended to enable experiments to dynamically respond to data and adjust measurement strategies in real time, it also accommodates non-interventional agents that process data and generate visualizations to guide researchers without directly controlling the experiment. To categorize the varying levels of adaptivity achievable within this framework, we conceptualize adaptive behavior along three key axes—decision-making rate, degree of signal abstraction, and processing modality [Fig. 1]. Up-to-date details on the design, implementation, and instructions for use can be found in the online documentation.§ Herein, we implemented a simple agent that consumed data and performed a random walk based on that data using the Bluesky Adaptive harness.
The client integrates ROS2 Action-client nodes as Ophyd objects and provides the overall experiment orchestration. The request for a robot action and the subsequent data acquisition are facilitated by the Bluesky RunEngine. The next sample to be measured is suggested by Bluesky Adaptive—a library from the Bluesky ecosystem designed for intelligent and adaptive decision-making—based on the previous measurements.29 The proposed sample from Bluesky Adaptive is then queried against a database containing sample information to retrieve the corresponding ID for the pose estimation process. Together, data acquisition, database integration, and sample proposals complete the feedback loop between the Bluesky Adaptive agent and the RunEngine execution. The final step in the client-side operations is packaging the sample ID into the ROS2 Action goal message, which is sent to the Action server.
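A minimal sketch of this client-side sequence follows, assuming a hypothetical Ophyd-style robot device whose set() method packages the sample ID into the Action goal; the `robot`, `detector`, and `sample_db` names are illustrative stand-ins.

```python
# Sketch of the client-side loop: load the suggested sample, then measure.
# `robot`, `detector`, and `sample_db` are illustrative stand-ins.
import bluesky.plan_stubs as bps
import bluesky.plans as bp


def measure_suggested_sample(robot, detector, sample_db, suggestion):
    """Load the sample proposed by the agent with the robot, then acquire data."""
    sample_id = sample_db[suggestion]       # look up the fiducial/ArUco ID for the sample
    yield from bps.mv(robot, sample_id)     # the robot device's set() sends the Action goal
    yield from bp.count([detector], num=1)  # acquire once the sample is in the beam
```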
We implemented three key functionalities in the design of the Action server. Our design allowed the cancellation of a goal during execution, tracked the progress of goal execution, and reported updates back to the client. To accomplish this, we used a finite state machine (FSM) to orchestrate the pick-and-place sequence for a given sample [Fig. 3]. We integrated robot control with this state machine and broke it into components that included robot arm path planning, pose estimation, and gripper control. We used MoveIt,¶ a ROS2 library for robot motion planning and execution, alongside ROS2 drivers provided by the robot component manufacturers. We further developed the gripper control module using ROS2 Service nodes, enabling the gripper to operate synchronously with the state machine while transitioning its state based on real-time gripper status feedback. We handled sample pose estimation with a separate Service node that interfaced with the state machine through ROS2 communication. Each ROS2 node in this architecture was launched in a separate container to provide robust availability and process management.
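The skeleton below sketches how such an Action server could expose cancellation and fractional progress feedback with rclpy, under the assumption of a hypothetical pick-and-place action type with `progress` feedback and `success` result fields; the per-state motion planning, pose estimation, and gripper calls are elided.

```python
# Sketch of an Action server with cancellation and progress feedback.
# The action type and its `progress`/`success` fields are hypothetical.
from rclpy.action import ActionServer, CancelResponse
from rclpy.node import Node


class PickPlaceServer(Node):
    def __init__(self, action_type):
        super().__init__("pick_place_server")
        self._action_type = action_type
        self._server = ActionServer(
            self,
            action_type,
            "pick_place",
            execute_callback=self.execute,
            cancel_callback=lambda goal: CancelResponse.ACCEPT,
        )

    def run_state(self, state):
        """Placeholder for motion planning, pose estimation, and gripper calls."""

    def execute(self, goal_handle):
        # Step through the FSM, publishing fractional progress after each state.
        states = ["pickup_approach", "pickup", "grasp", "pickup_retreat",
                  "place_approach", "place", "release", "place_retreat", "home"]
        feedback = self._action_type.Feedback()
        for i, state in enumerate(states, start=1):
            if goal_handle.is_cancel_requested:
                goal_handle.canceled()
                return self._action_type.Result()
            self.run_state(state)
            feedback.progress = i / len(states)   # fractional completion for the client
            goal_handle.publish_feedback(feedback)
        goal_handle.succeed()
        result = self._action_type.Result()
        result.success = True
        return result
```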
Several types of messages and information are passed within this architecture. The primary message is the ROS Action goal sent between the client and server, describing the overall robot task. This message contained the robot pose for approaching and placing the sample in the end station, a boolean flag for whether the sample was being placed or returned to the library, and a sample ID. The communication back to the client included an update of the fractional completion of the pick-and-place task and whether it completed successfully. Without computer vision, the poses for the robot to approach and pick up the sample were also included. Messages for MoveIt services were abstracted using the built-in “Move Group Interface”. Messages for the gripper contained the percentage of the range of motion for the gripper to open or close to. Within Bluesky, information is passed to the RunEngine using Python generator coroutines, and to the adaptive agents using the document model prescribed by Bluesky.
Fig. 4 The robot Action progression used a finite state machine to track progress. This FSM provided dynamic feedback to the Bluesky RunEngine for progress updates and synchronization management.
The state machine was designed to provide an intuitive pick-and-place process that was configurable and flexible for multiple end stations. Each state transition is contingent upon successful motion planning and execution of the robot trajectory. Instances of failure break the state propagation, and a failure message is communicated back to the Action client. The 12-state FSM begins in a predetermined Home pose, configurable through ROS2 node parameters, which provides a safe robot position for other actuation at the end station; the gripper is opened in this state.
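As a brief sketch of this configurability, static poses such as Home could be declared as ROS2 node parameters so that re-deployment to a new end station only requires new launch-time values; the parameter names and default joint values below are illustrative, not our deployed configuration.

```python
# Sketch: exposing static poses as ROS2 node parameters for easy re-deployment.
# Parameter names and default joint values are illustrative.
from rclpy.node import Node


class PickPlaceConfig(Node):
    def __init__(self):
        super().__init__("pick_place_config")
        # Joint-space poses, overridable at launch or via a YAML parameters file.
        self.declare_parameter("home_pose", [0.0, -1.57, 1.57, 0.0, 1.57, 0.0])
        self.declare_parameter("place_pose", [0.3, 0.0, 0.25, 0.0, 3.14, 0.0])
        self.home_pose = self.get_parameter("home_pose").value
        self.place_pose = self.get_parameter("place_pose").value
```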
Upon accepting a goal request from the client, the Action server begins to execute the FSM, re-initializing to Home then progressing to Pickup Approach. The Pickup Approach state was designed to reduce path planning complexity in approaching the library of available samples. When using computer vision to determine the location of the desired sample, this state determines the field of view for the camera. Upon successful completion of the trajectory, and a successful sample pose return by the Pose Estimation Service, the FSM transitions to the Pickup state. In this state, the robot moves to a pose where the gripper fingers are orthogonal to the face of the sample, then moves such that the gripper surrounds the sample. Success here triggers a grasp attempt that closes the gripper fingers and progresses the FSM to either a Grasp Success or Grasp Failure state, depending on the closure of the gripper. Grasp success was determined by achieving the desired fraction of closure; e.g., if the gripper closed too much or too little, this was considered a failure. In the future, this could be extended to include visual confirmation or other signal fusion. At any point prior to this grasp, the user has the option to change course, halt the FSM, and return to Home.
Following a successful grasp of the sample, the FSM transitions the robot back to the Pickup Approach pose in a new state, Pickup Retreat. The placing of the sample into the experiment apparatus, in our case the X-ray beam path, then follows a similar series of states as picking up the sample. The Place Approach state uses a trajectory in joint space to prepare the sample for placement, and Place uses a Cartesian path to align the sample into the apparatus. This location is configured statically, similarly to the Home pose, and can be rapidly reconfigured using the ROS parameters at startup. Both of these parameters are meant to be relatively static throughout a deployment, and as such are loaded at startup rather than reloaded at each relevant state execution in the FSM. Adjusting these parameters required a reboot of the Action server container. Other parameters, such as the obstacles in the environment, could be modified dynamically and would update the planning scene through callbacks. A Release operation by default assumes success as long as the gripper opens; however, it could include additional validation using sensors at the end station. Lastly, Place Retreat moves the robot to a position where it can clear the geometry of the sample in its path planning to return to Home.
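A compact way to view the nominal progression is as a transition table over named states. The sketch below lists only the states named in the text (the full 12-state machine includes additional bookkeeping states) and shows the success path only; failures instead break propagation and report back to the Action client.

```python
# Sketch of the success-path transitions of the pick-and-place FSM.
# Only the states named in the text are enumerated here.
from enum import Enum, auto


class State(Enum):
    HOME = auto()
    PICKUP_APPROACH = auto()
    PICKUP = auto()
    GRASP_SUCCESS = auto()
    GRASP_FAILURE = auto()
    PICKUP_RETREAT = auto()
    PLACE_APPROACH = auto()
    PLACE = auto()
    RELEASE = auto()
    PLACE_RETREAT = auto()


SUCCESS_PATH = {
    State.HOME: State.PICKUP_APPROACH,
    State.PICKUP_APPROACH: State.PICKUP,
    State.PICKUP: State.GRASP_SUCCESS,        # or State.GRASP_FAILURE on a bad closure
    State.GRASP_SUCCESS: State.PICKUP_RETREAT,
    State.PICKUP_RETREAT: State.PLACE_APPROACH,
    State.PLACE_APPROACH: State.PLACE,
    State.PLACE: State.RELEASE,
    State.RELEASE: State.PLACE_RETREAT,
    State.PLACE_RETREAT: State.HOME,
}
```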
At this stage, Action execution is finalized and the client can continue the rest of the experiment as orchestrated in Bluesky. In our development environments, this involved a simulated detector recording noise. At the PDF beamline, this involved moving the platform that the robot was mounted on so that the robot would not interfere with the X-ray beam, then progressing with the automated data collection on the sample [Fig. S1†]. Following the experiment execution, the sample could be returned to storage using a new Action goal from the client. The same FSM drives this object manipulation, with the Pickup and Place poses reversed.
We used computer vision and fiducial markers for pose estimation of the sample holder. We deployed an Azure Kinect DK 12-megapixel camera with a 1-megapixel Time of Flight (ToF) depth camera. Captured images had a 640 × 576 pixel resolution and a field of view of 75° × 65°. We deployed all services as containers on virtual machines within a common subnet, with the exception of the camera services. We connected the camera to a physical server (a System76 Meerkat computer with a 13th-gen Intel processor and 64 GB of memory) to accommodate its USB interface. The fiducial markers were from the marker family DICT_APRILTAG_36h11 in the ArUco class of markers, were affixed to a common relative location on all samples, and were printed at 26.65 cm in width and height. The DICT_APRILTAG_36h11 family of markers is compatible with both ArUco pose detection and AprilTag pose detection.30,31 The samples we used for this study were brackets that hold varying sizes and counts of capillaries for the PDF beamline. The brackets at the beamline were made of machined aluminum, and those at the experimental testbed were 3D printed to match this form factor.
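For illustration, a minimal pose-estimation routine with OpenCV's ArUco module (the 4.7+ detector API) could look like the following; the marker size constant and camera intrinsics are placeholders, not the beamline calibration, and this is a sketch rather than our deployed Pose Estimation service.

```python
# Sketch of fiducial-based pose estimation with OpenCV's ArUco module.
# MARKER_SIZE and the camera intrinsics are illustrative placeholders.
import cv2
import numpy as np

MARKER_SIZE = 0.0266  # marker edge length in metres (illustrative)

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_APRILTAG_36h11)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())


def estimate_sample_poses(image, camera_matrix, dist_coeffs):
    """Return (marker_id, rvec, tvec) for each detected fiducial."""
    corners, ids, _ = detector.detectMarkers(image)
    if ids is None:
        return []
    # 3D marker corners in the marker frame, ordered to match detectMarkers output.
    half = MARKER_SIZE / 2.0
    obj_points = np.array([[-half, half, 0], [half, half, 0],
                           [half, -half, 0], [-half, -half, 0]], dtype=np.float32)
    poses = []
    for marker_id, marker_corners in zip(ids.flatten(), corners):
        ok, rvec, tvec = cv2.solvePnP(obj_points, marker_corners.reshape(4, 2),
                                      camera_matrix, dist_coeffs)
        if ok:
            poses.append((int(marker_id), rvec, tvec))
    return poses
```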
We then closed the experimental loop using Bluesky Adaptive in combination with a sample database. With the pose estimation in place, the Bluesky-ROS client could request that any sample be loaded by referencing a unique identifier, then perform a measurement on that sample. We built an in-memory sample database using Redis|| to store sample data in relation to the ArUco identifier, so that the resulting Bluesky plan could reference the ArUco tag directly. We defined the measurement as a Bluesky plan, and used a simulated detector to create data in our tabletop setup. Lastly, we used Bluesky Adaptive to integrate a procedure that consumed data and decided which sample to measure next. For this, we used a lockstep approach in Bluesky Adaptive that placed the agent in-process with the RunEngine. In this approach, the RunEngine piped its document stream to the agent, which consumed images from the detector and requested the next sample to measure based on that image data. Thus, we closed the loop in a workflow that measured a sample, chose which sample to measure next, cross-referenced that sample to an ArUco identifier, and loaded that sample prior to measurement.
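A minimal sketch of such a lookup with the redis-py client is shown below; the key schema and metadata fields are illustrative rather than our production layout.

```python
# Sketch of the in-memory sample lookup, assuming a local Redis instance.
# Key names and metadata fields are illustrative.
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)


def register_sample(aruco_id: int, name: str, composition: str) -> None:
    """Store sample metadata keyed by its ArUco fiducial ID."""
    r.hset(f"sample:{aruco_id}", mapping={"name": name, "composition": composition})


def lookup_sample(aruco_id: int) -> dict:
    """Retrieve the metadata a Bluesky plan needs to reference this sample."""
    return r.hgetall(f"sample:{aruco_id}")
```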
The Bluesky Adaptive package provides a harness for arbitrary agents to act on data and drive experiments in the Bluesky ecosystem.29 In practice with Bluesky Adaptive, these agents can be assembled in a simple closed loop or a large agentic network of communicating decision makers. As the degree of intelligence and type of reasoning can be highly experiment specific,27 we endowed our experiment with limited intelligence to demonstrate the pipeline integrating a closed-loop measurement. We used a Markov chain Monte Carlo agent that consumed the readings produced by the Bluesky simulated detector and suggested the next sample to load and measure. The agent transition probabilities were based on a uniform distribution, with acceptance probabilities based on a random pixel value. In a production setting, this agent should be endowed with knowledge of the physics or experiment characteristics. This demonstrated the integration of two levels of adaptive decision making in the architecture: first, computer vision at the control level, and second, agent authority at the scientific level.
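The following sketch captures the spirit of that decision maker as a standalone class with tell/ask-style methods. The actual Bluesky Adaptive base class requires additional methods (e.g., for unpacking runs and declaring measurement plans), so this is a simplified stand-in rather than a drop-in agent.

```python
# Sketch of the random-walk decision maker: uniform proposals over the
# library of ArUco IDs, with acceptance driven by a random pixel value.
import numpy as np


class RandomWalkAgent:
    def __init__(self, sample_ids, rng=None):
        self.sample_ids = list(sample_ids)
        self.rng = rng or np.random.default_rng()
        self.current = self.sample_ids[0]
        self.last_image = None

    def tell(self, image):
        """Consume the latest detector image from the document stream."""
        self.last_image = np.asarray(image)

    def ask(self):
        """Propose the next sample: uniform proposal, pixel-valued acceptance."""
        proposal = self.rng.choice(self.sample_ids)
        if self.last_image is not None:
            pixel = self.rng.choice(self.last_image.ravel())
            accept = self.rng.random() < float(pixel) / (self.last_image.max() + 1e-9)
        else:
            accept = True
        if accept:
            self.current = proposal
        return self.current
```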
To evaluate the robustness of our system to variations in sample placement and angular positioning, we conducted a series of repeatability tests using fiducial marker-based pose estimation with an Azure Kinect DK camera. For these tests, the camera was mounted at a fixed position separate from the robot wrist. The sample holders were systematically positioned at 10 cm intervals along the workspace, with varying yaw angles relative to the camera. Successful detections and grasp attempts depended on two primary criteria: (1) the fiducial tag must remain visible to the camera within its field of view, and (2) the sample holder must be oriented within an angle range that allows the robotic arm to approach and grasp it effectively. Grasp failures primarily occurred when samples were placed more than 65 cm from the camera [Fig. S2†]. In such cases, inaccurate pose estimations of occluded fiducial markers resulted in a 50% success rate (2 out of 4 grasps). We also measured, through manual variation, the acceptable yaw range that ensured an orthogonal approach for the robot grasp, and logged these ranges for each position above [Table S1†]. While fiducial markers provided reliable pose estimation under most conditions, certain fundamental limitations must be noted. Detection accuracy is influenced by occlusion, extreme viewing angles, distortions in the marker's surface, and lighting conditions, which can impact the consistency of marker recognition. Even with accurate pose estimation, a successful grasp still requires an orthogonal approach by the gripper, thus imposing limits on the sample orientation.
Our approach to pose estimation and its integration into the ROS2 service provides a clear path for extensibility to variable environments and engineered sample morphologies. Nonetheless, there are several opportunities for enhancement in this approach. Firstly, the requirement for a local physical server to execute the camera nodes introduced latency into our overall system infrastructure, and a networked camera would be an immediate improvement. We chose the Azure Kinect DK to exploit its ToF depth sensing capabilities; however, we found that the repeatability of pose estimation using available foundation models was insufficient for our application.33,34 We expect that as the field advances, there will be new opportunities to replace the fiducial markers. Extending the computer vision capabilities would only involve replacing the algorithm in the Pose Estimation service. At present, the reliability requirements for our systems led us to engineer sample holders and fiducial markers. Given the routine capabilities in additive manufacturing, we see ample opportunities to extend this system to use diverse sample morphologies, and to increase the precision of the fiducial marker relative to the grasp point of the sample by printing the marker directly into the sample holder. Furthermore, as the experiment environment becomes more densely populated with samples, there will be a need to integrate the object Pose Estimation service with the obstacle registration services for dynamic collision avoidance.
While not innovative from an AI algorithm standpoint, our use of Bluesky Adaptive created an extensible closed-loop system between samples and experiments that paves the way for increasingly complex self-driving measurements at user facilities. The Bluesky Adaptive harness is built to be extended to arbitrary levels of decision-making complexity, and provides a framework for higher order decision making in Bluesky orchestrated experiments [Fig. 1]. The agents built using this harness can also be automatically deployed with an HTTP interface, opening up opportunities for designing effective human-agent interaction, introspection, and control with web tools. Prototype self-driving experimental campaigns have been accomplished using only Bluesky orchestrated automation before,27 although this integration of sample databases, computer vision, and robotic sample management with Bluesky Adaptive provides a framework for long-running campaigns in the BaaS model. Future work at facilities leveraging this technology should focus on extending the diversity of experimental environments that can be integrated with this kind of solution. Currently, the effort required to modify or extend this architecture to new tasks depends on the degree of modification. Performing the same pick-and-place task in a different environment or with a different sample morphology is a trivial change that only involves parametrization. Adjusting the states in the FSM, or adding additional states and transitions, would require adjustment of the server source code. Nonetheless, there are opportunities to make this approach more easily extensible to completely novel tasks without substantial reprogramming.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5dd00036j |
‡ https://blueskyproject.io. |
§ https://blueskyproject.io/bluesky-adaptive. |
¶ https://moveit.ai/. |
|| https://redis.io/. |
This journal is © The Royal Society of Chemistry 2025