Enabling trusted research at scale with AWS and SoftwareOne

Maarten Bruntink, Global AWS Solutions Director

In the first blog post in this series, we explored the real-world risks of not securing research data properly. From shadow data flows and uncontrolled exports to over-permissive collaboration and fragmented audit trails, the message was clear: research organizations need a better way to protect sensitive data without slowing discovery.

This is where Trusted Research Environments, or TREs, become important.

A TRE provides a secure and governed environment where approved researchers can analyze sensitive data without that data being copied, downloaded, or distributed across uncontrolled locations. Instead of sending data to researchers, the model brings researchers to the data. Access is controlled, activity is logged, data movement is restricted, and only approved outputs are allowed to leave after review.

This article looks at how that model can be implemented on AWS. It explains how a secure cloud foundation supports trusted research, how governed data platforms and research workspaces fit together, and how controlled data transfer and output review help institutions make sensitive data usable while maintaining trust, compliance, and accountability.

The foundation: Secure Research Environment

Before a Trusted Research Environment can operate effectively, the underlying cloud environment must be secure, governed, and repeatable. This is the role of the Secure Research Environment, or SRE.

An SRE acts as the landing zone for sensitive research workloads. It provides the AWS foundation on which trusted research capabilities can be built, giving institutions a consistent baseline for security, compliance, and operational control.

In practice, this means a multi-account AWS architecture with centralized identity and access management, controlled account provisioning, segmented networking, encryption, centralized logging, security monitoring, backup, policy guardrails, and automated infrastructure management. Instead of every research project building its own isolated environment from scratch, workloads are placed into a governed structure where core controls are inherited by default.

The SRE can be implemented using typical AWS landing zone patterns, which bring together capabilities such as:

Capability: Purpose
Multi-account structure: Separates workloads, environments, shared services, logging, security, and operational boundaries.
Controlled account provisioning: Creates new research accounts or environments through repeatable patterns, reducing manual setup and configuration drift.
Central identity and access: Enables federated access, MFA, role-based access control, privileged access management, and consistent user governance.
Network segmentation: Controls connectivity between research workloads, shared services, data environments, on-premises networks, and external endpoints.
Private connectivity: Uses private network paths and VPC endpoints where appropriate to reduce exposure and constrain service access.
Central logging and monitoring: Provides audit trails, operational visibility, security event collection, and a consistent evidence base.
Encryption and key management: Protects data at rest and in transit, including use of customer-managed keys where required.
Guardrails and policies: Applies preventive and detective controls across accounts and organizational units.
Backup and recovery: Defines backup, retention, recovery, and resilience patterns for regulated workloads.
Continuous assurance: Supports ongoing monitoring, configuration checks, security posture management, and evidence collection.

When these capabilities are combined, the SRE provides a strong technical baseline for research workloads that need to align with compliance frameworks such as ISO 27001, NIST SP 800-171, HIPAA, CMMC, GDPR, or other sector- and region-specific requirements.
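To make the idea of preventive guardrails more concrete, the sketch below builds a service control policy (SCP) document of the kind that could be attached to a research organizational unit. It is an illustration only, not the policy set of any particular landing zone: the function name, the choice of statements, the list of exempted global services, and the example regions are all assumptions made for this example.

```python
import json


def build_guardrail_scp(allowed_regions):
    """Return an illustrative SCP document as a plain dict.

    The three statements are assumptions chosen for this sketch; a real
    landing zone would carry a larger, reviewed set of guardrails.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Keep research accounts inside the governed organization.
                "Sid": "DenyLeavingOrganization",
                "Effect": "Deny",
                "Action": "organizations:LeaveOrganization",
                "Resource": "*",
            },
            {
                # Protect the central audit trail from being switched off.
                "Sid": "DenyDisablingAuditTrail",
                "Effect": "Deny",
                "Action": ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
                "Resource": "*",
            },
            {
                # Restrict regional services to approved regions. The
                # NotAction list is a short, incomplete set of global
                # services used here purely for illustration.
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                "NotAction": ["iam:*", "organizations:*", "sts:*", "support:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            },
        ],
    }


# Example regions are hypothetical; substitute the regions your
# data residency requirements allow.
scp = build_guardrail_scp(["eu-west-1", "eu-west-2"])
print(json.dumps(scp, indent=2))
```

Because the policy is attached at the organizational unit level, every research account created beneath it inherits the same restrictions, which is what "core controls are inherited by default" means in practice.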

From secure foundation to trusted research

Once the foundation is in place, the next challenge is enabling researchers to work with sensitive data without distributing that data across unmanaged locations.

The scale and complexity of modern collaborative research make the historical model of transferring sensitive data unsustainable. Trusted Research Environments offer a secure alternative by bringing researchers to the data, allowing them to conduct analysis within a governed platform where only approved outputs can be exported.

As described in the first blog post, the TRE makes this possible through three components: a governed data environment, data movement controls, and a controlled Virtual Research Environment.

Case study

SoftwareOne helped a UK nonprofit health think tank build a secure Trusted Research Environment (TRE) on AWS, moving from constrained on-premises systems to a cloud-based research data platform that meets strict NHS governance requirements. Built using AWS Landing Zone and SoftwareOne’s AWS expertise, the TRE enables researchers to securely analyze anonymized NHS patient data, combine multiple data sources, improve data quality, and generate insights at scale while maintaining privacy, compliance, and controlled data access.

Controlled data transfer and output review

The Data Review and Transfer Component, or DRTC, is an AWS-developed solution for reviewing, approving, automating, and auditing sensitive data transfer requests into and out of secure environments such as Trusted Research Environments. It is designed to help organizations control how data, code, results, and other research artefacts move across the TRE boundary. This capability is becoming increasingly important as trusted research models mature.

In practice, DRTC turns data movement into a governed workflow. Incoming datasets can be placed into a staging or quarantine location, where automated checks, manual review, or both can take place before data is moved into the data platform. Outgoing results can follow a similar pattern, with outputs reviewed and approved before release.

This is all managed through a web-based portal, where administrators can configure the review process for each storage location. DRTC supports a two-stage approval workflow with a first reviewer and optional additional reviewers. These reviewers can be individual users or groups, such as a data governance committee, subject matter experts, or an independent review group to support segregation of duties.
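The logic of a two-stage review can be sketched as a small state machine. The code below is a conceptual model only and does not use or represent the DRTC API: the class, field, and status names are hypothetical, and the rules that every additional reviewer must approve and that requesters cannot review their own transfers are assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    AWAITING_FIRST_REVIEW = "awaiting_first_review"
    AWAITING_ADDITIONAL_REVIEW = "awaiting_additional_review"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class TransferRequest:
    """Hypothetical model of one transfer across the TRE boundary."""
    artefact: str
    requester: str
    first_reviewer: str
    additional_reviewers: set = field(default_factory=set)
    status: Status = Status.AWAITING_FIRST_REVIEW
    approvals: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def review(self, reviewer, approve):
        """Record one review decision and return the resulting status."""
        if self.status in (Status.APPROVED, Status.REJECTED):
            raise ValueError("request is already closed")
        # Segregation of duties: assumed rule for this sketch.
        if reviewer == self.requester:
            raise PermissionError("requesters cannot review their own transfer")
        first_stage = self.status is Status.AWAITING_FIRST_REVIEW
        if first_stage and reviewer != self.first_reviewer:
            raise PermissionError("only the first reviewer can act at this stage")
        if not first_stage and reviewer not in self.additional_reviewers:
            raise PermissionError("reviewer is not assigned to this request")

        # Every decision is logged, whether it approves or rejects.
        self.audit_log.append((reviewer, "approve" if approve else "reject"))
        if not approve:
            self.status = Status.REJECTED
        elif first_stage:
            # Skip the second stage when no additional reviewers are set.
            self.status = (
                Status.AWAITING_ADDITIONAL_REVIEW
                if self.additional_reviewers
                else Status.APPROVED
            )
        else:
            self.approvals.add(reviewer)
            if self.approvals >= self.additional_reviewers:
                self.status = Status.APPROVED
        return self.status


# Hypothetical output request reviewed by an individual and then a group.
request = TransferRequest(
    artefact="results/summary.csv",
    requester="alice",
    first_reviewer="bob",
    additional_reviewers={"governance-committee"},
)
request.review("bob", approve=True)
request.review("governance-committee", approve=True)
print(request.status, request.audit_log)
```

A single rejection closes the request in this model, and the audit log keeps the reviewer and decision for each step, mirroring the review, approval, and audit functions described above.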

The governed data platform

Behind the transfer boundary sits the governed data platform. This is where sensitive datasets are stored, curated, catalogued, prepared, and made available for approved research projects.

This layer should not be treated as a generic storage bucket. It is one of the most important design areas in the TRE because it determines how usable, reusable, governable, and scalable the research environment becomes.

The right design depends heavily on the research domain, data types, regulatory context, analytical methods, and collaboration model. A genomics platform, a clinical research environment, a defense research dataset, an industrial R&D environment, and a social science data platform will not have identical requirements.

That is why this layer should typically start with a data platform assessment. The assessment should clarify:

Assessment area: Questions to answer
Data sources: Where does data come from, who owns it, and how is it ingested?
Data sensitivity: What classifications apply, and what restrictions follow from them?
Data structure: Is the data tabular, imaging, genomic, sensor-based, unstructured, or multi-modal?
Data preparation: What validation, transformation, pseudonymization, de-identification, or curation steps are required?
Access model: Which projects, roles, and users should access which datasets?
Metadata and discovery: How will researchers find approved data without overexposing sensitive assets?
Lineage and reproducibility: How will datasets, transformations, queries, and outputs be traced?
Lifecycle and retention: How long should data be retained, archived, or removed?
Analytics needs: Are researchers using SQL, notebooks, HPC, ML, generative AI, statistical tools, or domain-specific applications?
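The data sensitivity and access model questions lend themselves to a simple worked example. The sketch below shows one way the answers might be expressed as a rule: a researcher can use a dataset only if their clearance covers its classification and they belong to a project approved for it. The classification labels, class names, and the rule itself are hypothetical; an actual assessment would define its own.

```python
from dataclasses import dataclass

# Hypothetical classification labels, ordered from least to most sensitive.
CLASSIFICATION_ORDER = ["public", "internal", "confidential", "restricted"]


@dataclass(frozen=True)
class Dataset:
    name: str
    classification: str
    approved_projects: frozenset


@dataclass(frozen=True)
class Researcher:
    name: str
    clearance: str
    projects: frozenset


def can_access(researcher, dataset):
    """Return True only if both the clearance and the project checks pass."""
    cleared = CLASSIFICATION_ORDER.index(
        researcher.clearance
    ) >= CLASSIFICATION_ORDER.index(dataset.classification)
    shares_project = bool(researcher.projects & dataset.approved_projects)
    return cleared and shares_project


# Illustrative records; the names and project identifiers are made up.
cohort = Dataset("patient-cohort", "confidential", frozenset({"cardio-study"}))
alice = Researcher("alice", "restricted", frozenset({"cardio-study"}))
bob = Researcher("bob", "internal", frozenset({"cardio-study"}))
carol = Researcher("carol", "restricted", frozenset({"genomics-study"}))

for person in (alice, bob, carol):
    print(person.name, can_access(person, cohort))
```

Writing the rule down in this form early in the assessment makes it easier to test edge cases with data owners before it is translated into catalogue permissions.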

Based on this data platform assessment, the architecture can be tailored to the research domain, governance model, and analytics needs. In some cases, for example, Amazon SageMaker Unified Studio can provide a governed data and AI environment, bringing together Amazon EMR, AWS Glue, Amazon Athena, Amazon Redshift, Amazon Bedrock, and SageMaker AI.

Researchers can work with notebooks, SQL, Python, machine learning workflows, and natural language queries, connecting to data in Amazon S3, AWS Glue Data Catalog, Amazon Athena, and Amazon Redshift. Amazon S3 provides the storage foundation, with lakehouse architecture, data cataloguing, access control, data product ownership, and scalable processing added where needed.
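As one example of how the storage foundation can be tied back to the private connectivity and encryption-in-transit controls of the SRE, the sketch below builds an S3 bucket policy that denies requests made without TLS and requests that do not arrive through a specific VPC endpoint. The bucket name, endpoint ID, and function name are placeholders invented for this example. A blanket deny of this kind can also lock out administrators and service integrations, so real deployments usually need carefully reviewed exceptions.

```python
import json


def build_bucket_policy(bucket_name, vpc_endpoint_id):
    """Return an illustrative S3 bucket policy as a plain dict."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    # Cover both the bucket itself and every object inside it.
    resources = [bucket_arn, f"{bucket_arn}/*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject any request that is not sent over TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": resources,
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Reject any request that bypasses the research VPC endpoint.
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": resources,
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            },
        ],
    }


# Placeholder identifiers, not real resources.
policy = build_bucket_policy("example-research-data", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```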

Secure research workspaces with RES

Researchers interact with the TRE through the Virtual Research Environment. This is where approved users access compute, software, and project data inside defined boundaries. The goal is to provide a practical working environment while keeping sensitive data inside the controlled TRE architecture.

Research and Engineering Studio on AWS, or RES, can support this workspace layer. RES is an AWS-supported, open-source solution that provides a web portal for scientists and engineers to run technical computing workloads on AWS. Users can launch secure Windows or Linux virtual desktops, use existing corporate credentials, and work in individual or collaborative projects. Administrators can define project spaces, assign software stacks, attach shared file systems, monitor usage, and set project budgets to help control consumption.

RES also includes other controls that are useful in trusted research settings, such as identity integration with SAML 2.0 or OIDC, desktop sharing profiles, restricted file browser access, controlled SSH access, custom permission profiles, and private VPC deployment patterns with VPC endpoints. These controls help limit data movement and reduce the risk of data leaving the research boundary.

Conclusion: The proven route to protection and progress

Organizations looking to build a Trusted Research Environment on AWS do not need to start from a blank page. AWS brings together scalable cloud infrastructure, mature security and governance services, research-specific solutions, and accelerators that support the trusted research lifecycle.

By combining services and accelerators such as SRE, DRTC, SageMaker Unified Studio, and RES, organizations get a strong head start. They can build from proven patterns, adapt the architecture to their data, compliance, and research requirements, and focus implementation effort where it matters most: governance, integration, researcher experience, and operational maturity.

SoftwareOne helps organizations turn that head start into a practical implementation. We work closely with AWS to design, deploy, and operationalize secure research platforms that align with institutional goals, compliance requirements, and researcher needs. SoftwareOne is also an official and repeatedly selected supplier in the OCRE framework, which provides European education and research bodies with trusted, simplified and accelerated cloud services.

Effective adoption of Trusted Research Environments provides institutions with a model that supports both protection and progress. Sensitive data remains governed, researchers gain access to the environments and tools they need, and organizations can demonstrate that research data is being used responsibly.

That is the new baseline for modern research IT: trust engineered into the platform so researchers can work faster, collaborate safely, and use sensitive data with confidence.

Trusted Research Environments: Get the essential guide

Learn more about how to improve research outcomes and data protection. Our new eBook includes guidance on accelerated design and implementation, as well as real-world case studies.
