How to Set Up AWS App Mesh for Service Communication

Learn how AWS App Mesh enhances service communication for SMBs, focusing on cost efficiency, reliability, and ease of integration.

AWS App Mesh simplifies communication between services, especially for small and medium-sized businesses (SMBs) in the UK. Here's why it matters:

  • Cost: App Mesh is free to use - pay only for AWS resources like EC2 or Fargate.
  • Reliability: Reduces unplanned downtime by up to 69% with built-in metrics, logs, and traces.
  • Security: Encrypts service-to-service communication and manages authentication automatically.
  • Flexibility: Update infrastructure without changing your code.
  • Compatibility: Works with containers (EKS, ECS), EC2, and hybrid setups like AWS Outposts.

Quick Setup Steps:

  1. Prepare Permissions: Use the AWSAppMeshFullAccess IAM policy or custom policies for access.
  2. Choose Compute Environment: Supports EKS, ECS, Fargate, EC2, and more.
  3. Configure Network: Align VPC security with GDPR for UK businesses.
  4. Create Service Mesh: Define virtual services, nodes, and routers for traffic management.
  5. Deploy Sidecar Proxies: Use Envoy for managing service communication.

Important Note: AWS will discontinue App Mesh support after 30 September 2026. Plan implementations accordingly.

AWS App Mesh helps UK SMBs streamline service communication while ensuring security and scalability. Dive into the full guide for detailed steps and best practices.

Prerequisites and Initial Setup

Before diving into the configuration of App Mesh, it's important to prepare your AWS environment properly. This preparation helps ensure a smooth deployment and avoids unnecessary delays.

AWS Account and Permissions

Getting the right permissions in place is a critical first step for deploying App Mesh. By default, IAM users and roles don’t have the permissions needed to create or modify App Mesh resources. To address this, your IAM administrator must define specific policies that allow access to App Mesh APIs.

Here’s what you’ll need:

  • Use the AWSAppMeshFullAccess IAM policy, or a custom policy that includes the iam:CreateServiceLinkedRole permission, to create meshes (required for meshes created after 5th June 2019).
  • Attach the AWSAppMeshReadOnly managed policy to users who need read-only access to the App Mesh console.

App Mesh relies on service-linked roles, which come with all the permissions needed to interact with other AWS services. Start with AWS-managed policies for simplicity, but aim to transition to a least-privilege model. This means granting only the permissions your users and applications truly need and using policy conditions to enforce stricter access controls. To further enhance security, enable MFA and validate your policies using IAM Access Analyzer.
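As a rough illustration of the custom-policy option, the sketch below assembles a policy document in Python. The short list of appmesh actions and the wildcard resource are assumptions chosen for the example, not a recommended baseline; tighten both to match what your team actually needs.

```python
import json

# Assumption: a small, illustrative set of App Mesh actions.
MESH_ACTIONS = ["appmesh:CreateMesh", "appmesh:DescribeMesh", "appmesh:ListMeshes"]

# ARN pattern of the service-linked role that App Mesh uses.
SERVICE_LINKED_ROLE = (
    "arn:aws:iam::*:role/aws-service-role/"
    "appmesh.amazonaws.com/AWSServiceRoleForAppMesh"
)


def build_mesh_creator_policy() -> dict:
    """Return an IAM policy document that allows creating meshes."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": MESH_ACTIONS, "Resource": "*"},
            {
                # Lets the caller create the App Mesh service-linked role,
                # and only that role.
                "Effect": "Allow",
                "Action": "iam:CreateServiceLinkedRole",
                "Resource": SERVICE_LINKED_ROLE,
                "Condition": {
                    "StringLike": {"iam:AWSServiceName": "appmesh.amazonaws.com"}
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_mesh_creator_policy(), indent=2))
```

The printed JSON can be pasted into the IAM console as a customer-managed policy and then checked with IAM Access Analyzer before it is attached.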

Once permissions are sorted, the next step is to pick the right compute environment for your application.

Supported Compute Environments

After setting up permissions, focus on choosing the compute environment that fits your application needs. App Mesh supports a variety of compute environments, including Amazon EKS, Amazon ECS, AWS Fargate, and Amazon EC2.

For hybrid setups, App Mesh integrates with AWS Outposts, enabling service communication for on-premises applications.

When deciding on a compute environment, think about factors like operational complexity, scalability, and how well it integrates with your existing infrastructure. For containerised workloads, EKS or ECS are often the best choices. On the other hand, if you’re working with traditional, non-containerised applications, EC2 instances may be more suitable. Since App Mesh uses the Envoy proxy, it works seamlessly with various AWS partner tools and open-source technologies. This ensures consistent service communication, along with features like end-to-end visibility and high availability.

Network and Security Setup

Network configuration forms the backbone of your App Mesh deployment. AWS follows a shared responsibility model: they manage the security of the cloud, while you are responsible for securing everything within it. To get started, configure your VPC security groups to allow the necessary traffic for App Mesh.

For organisations in the UK, especially those handling personal data, ensure that your VPC setup aligns with GDPR requirements. This might include implementing proper network segmentation and enabling logging to track activity. You can also define customer-managed policies tailored to your specific needs, adding extra layers of restriction beyond AWS's managed policies.

AWS provides several managed policies for App Mesh, such as:

  • AWSAppMeshServiceRolePolicy
  • AWSAppMeshEnvoyAccess
  • AWSAppMeshFullAccess
  • AWSAppMeshReadOnly

Keep in mind that AWS plans to discontinue App Mesh support after 30th September 2026. If you’re working on long-term projects, it’s wise to plan your implementation timeline carefully and consider alternative solutions for the future.

Step-by-Step Guide to Setting Up AWS App Mesh

Once your AWS environment is prepped, you can configure App Mesh using three primary components: the service mesh, virtual components (like services, nodes, and routers), and sidecar proxies.

Creating a Service Mesh

To kick things off, you'll need to define your service mesh. Think of the service mesh as a logical boundary that manages network traffic between your services.

To create one via the AWS Management Console, head to the App Mesh console and click "Create mesh". Give your mesh a clear, descriptive name to keep your infrastructure organised as it scales.

You can also enable external traffic during setup, allowing services outside the mesh to communicate with those inside. Additionally, you can specify whether to use IPv4 or IPv6 instead of sticking with the default IP version.

For those who prefer the AWS CLI, you can create a mesh with this command:

aws appmesh create-mesh --mesh-name meshName

If you run the App Mesh controller on Kubernetes, you can also use namespace selectors with labels on the Mesh resource to limit which namespaces are associated with your mesh.
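The external-traffic and IP-version options map onto the mesh specification. The sketch below builds a request body with those two settings; the function name and the example mesh name are hypothetical, and the field names follow the App Mesh CreateMesh API as best understood here, so check them against the current API reference.

```python
import json

IP_PREFERENCES = {"IPv4_ONLY", "IPv4_PREFERRED", "IPv6_ONLY", "IPv6_PREFERRED"}


def build_mesh_request(mesh_name: str, allow_external: bool, ip_preference: str) -> dict:
    """Build a CreateMesh request body with an egress filter and IP preference."""
    if ip_preference not in IP_PREFERENCES:
        raise ValueError(f"unsupported IP preference: {ip_preference}")
    return {
        "meshName": mesh_name,
        "spec": {
            # ALLOW_ALL lets proxies forward traffic to destinations
            # outside the mesh; DROP_ALL restricts them to mesh resources.
            "egressFilter": {"type": "ALLOW_ALL" if allow_external else "DROP_ALL"},
            "serviceDiscovery": {"ipPreference": ip_preference},
        },
    }


if __name__ == "__main__":
    request = build_mesh_request("demo-mesh", allow_external=True, ip_preference="IPv4_PREFERRED")
    print(json.dumps(request, indent=2))
```

The resulting JSON can be saved to a file and passed to the CLI with --cli-input-json, or unpacked into a boto3 appmesh client's create_mesh call.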

Setting Up Virtual Services, Nodes, and Routers

With your service mesh ready, the next step is to define the virtual components that control service communication:

  • Virtual services act as placeholders for real services. They allow dependent services to reference an actual service by its service discovery name, such as my-service.default.svc.cluster.local.
  • Virtual nodes represent actual workloads running on platforms like Amazon ECS, Kubernetes, or EC2. When setting up a virtual node, make sure the Envoy proxy container includes the APPMESH_RESOURCE_ARN environment variable, pointing to the ARN of the virtual node.
  • Virtual routers manage traffic distribution for virtual services. Once a router is created, you can define routes to direct requests to specific virtual nodes based on criteria like HTTP headers, paths, or traffic weights.
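To make the relationship between the three components concrete, here is a minimal sketch of their specifications as Python dictionaries. The helper names, hostnames and ports are hypothetical; the nesting mirrors the App Mesh API shapes for virtual nodes, routers and services, which you should confirm against the API reference before use.

```python
from __future__ import annotations

import json


def build_virtual_node_spec(hostname: str, port: int, backends: list[str]) -> dict:
    """A virtual node: where the workload listens and which services it calls."""
    return {
        "listeners": [{"portMapping": {"port": port, "protocol": "http"}}],
        "serviceDiscovery": {"dns": {"hostname": hostname}},
        "backends": [
            {"virtualService": {"virtualServiceName": name}} for name in backends
        ],
    }


def build_virtual_router_spec(port: int) -> dict:
    """A virtual router: receives traffic for a virtual service on this port."""
    return {"listeners": [{"portMapping": {"port": port, "protocol": "http"}}]}


def build_virtual_service_spec(router_name: str) -> dict:
    """A virtual service whose traffic is handled by the named virtual router."""
    return {"provider": {"virtualRouter": {"virtualRouterName": router_name}}}


if __name__ == "__main__":
    node = build_virtual_node_spec(
        hostname="orders.default.svc.cluster.local",  # hypothetical service
        port=8080,
        backends=["payments.default.svc.cluster.local"],
    )
    print(json.dumps(node, indent=2))
    print(json.dumps(build_virtual_router_spec(8080), indent=2))
    print(json.dumps(build_virtual_service_spec("orders-router"), indent=2))
```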

For example, you could send 90% of traffic to a stable version of your service while directing 10% to a new version for canary testing. This allows you to manage traffic dynamically without altering your application code.
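A 90/10 split like this is expressed as weighted targets on a route. The sketch below builds such a route specification; the function and virtual node names are made up for illustration, and the structure follows the App Mesh HTTP route shape as understood here.

```python
import json


def build_canary_route(stable_node: str, canary_node: str, canary_percent: int) -> dict:
    """Build an HTTP route spec that splits traffic between two virtual nodes."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return {
        "httpRoute": {
            "match": {"prefix": "/"},  # match every request path
            "action": {
                "weightedTargets": [
                    {"virtualNode": stable_node, "weight": 100 - canary_percent},
                    {"virtualNode": canary_node, "weight": canary_percent},
                ]
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_canary_route("orders-v1", "orders-v2", 10), indent=2))
```

Raising canary_percent in small steps and updating the route lets you shift traffic gradually; setting it back to 0 sends all traffic to the stable version again.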

Installing Sidecar Proxies

With the virtual components configured, the final step is deploying sidecar proxies to handle inter-service communication. These proxies ensure smooth traffic flow between microservices within your mesh.

For Kubernetes, you can automate sidecar proxy injection into pods using a mutating webhook admission controller. The App Mesh Kubernetes controller uses namespace labels and pod annotations to determine if sidecar injection is enabled.
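As a sketch of what opting a namespace into injection can look like, the snippet below generates a Namespace manifest as JSON, which kubectl accepts alongside YAML. The appmesh.k8s.aws/sidecarInjectorWebhook label follows AWS's published controller examples; the "mesh" label key is an assumption that must match whatever namespace selector your Mesh resource uses.

```python
import json


def build_namespace_manifest(namespace: str, mesh_name: str) -> dict:
    """Build a Namespace manifest labelled for App Mesh sidecar injection."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": namespace,
            "labels": {
                # Assumption: the Mesh resource selects namespaces on this key.
                "mesh": mesh_name,
                "appmesh.k8s.aws/sidecarInjectorWebhook": "enabled",
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_namespace_manifest("orders", "demo-mesh"), indent=2))
```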

For Amazon ECS, update your task definitions to include the Envoy proxy container. Ensure the task's IAM role has the necessary permissions to interact with App Mesh resources.

When allocating resources for the Envoy proxy, aim for sufficient capacity to handle traffic efficiently. For example:

  • Allocate 512 CPU units and at least 64 MiB of memory for the Envoy container.
  • On Fargate, the lowest memory value you can set is 1,024 MiB, so size the task accordingly.

The Envoy container also needs IAM credentials to sign requests sent to App Mesh for routing information and metrics reporting.
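For ECS, the pieces above come together in the task definition. The following sketch builds the Envoy container definition, the proxy configuration and the dependency entry for an application container. The image URI and ARN are placeholders you must supply, the ports and health-check timings are typical values from AWS examples rather than requirements, and the function names are hypothetical.

```python
import json


def build_envoy_container(image: str, virtual_node_arn: str) -> dict:
    """Envoy sidecar container definition for an ECS task."""
    return {
        "name": "envoy",
        "image": image,
        "essential": True,
        "user": "1337",  # must match IgnoredUID in the proxy configuration
        "cpu": 512,
        "memoryReservation": 64,  # MiB, soft limit; Fargate tasks need more
        "environment": [{"name": "APPMESH_RESOURCE_ARN", "value": virtual_node_arn}],
        "healthCheck": {
            "command": [
                "CMD-SHELL",
                "curl -s http://localhost:9901/server_info | grep state | grep -q LIVE",
            ],
            "interval": 5,
            "timeout": 2,
            "retries": 3,
            "startPeriod": 10,
        },
    }


def build_proxy_configuration(app_port: int) -> dict:
    """Task-level proxy configuration that routes traffic through Envoy."""
    properties = {
        "IgnoredUID": "1337",
        "ProxyIngressPort": "15000",
        "ProxyEgressPort": "15001",
        "AppPorts": str(app_port),
        "EgressIgnoredIPs": "169.254.170.2,169.254.169.254",
    }
    return {
        "type": "APPMESH",
        "containerName": "envoy",
        "properties": [{"name": k, "value": v} for k, v in properties.items()],
    }


def app_depends_on_envoy() -> list:
    """dependsOn entry so the application starts only once Envoy is healthy."""
    return [{"containerName": "envoy", "condition": "HEALTHY"}]


if __name__ == "__main__":
    envoy = build_envoy_container("<envoy-image-uri>", "<virtual-node-arn>")
    print(json.dumps(envoy, indent=2))
    print(json.dumps(build_proxy_configuration(8080), indent=2))
    print(json.dumps(app_depends_on_envoy(), indent=2))
```

The IAM credentials mentioned above come from the task role, so that role needs permission to stream configuration for the virtual node named in APPMESH_RESOURCE_ARN.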

AWS provides two types of Envoy proxy container images: the standard version and a FIPS-compliant version, which is useful for industries with regulatory requirements. Regularly updating your Envoy version ensures you stay current with security patches, performance improvements, and new features.

Monitoring and Troubleshooting in AWS App Mesh

Once your AWS App Mesh is up and running, keeping an eye on its performance is essential to ensure smooth communication between services. Without proper monitoring, identifying and resolving performance bottlenecks can become a daunting task.

Setting Up Logging and Metrics

AWS App Mesh works seamlessly with tools like Amazon CloudWatch, Prometheus, AWS X-Ray, and Datadog to provide robust logging, monitoring, and tracing capabilities. This gives you the flexibility to integrate with tools that align with your existing setup.

Configuring access logs is a key step in understanding how traffic flows through your mesh. When you create virtual nodes and virtual gateways, make sure to enable Envoy access logs. These logs capture detailed information about traffic at both OSI Layer 4 and Layer 7, including response times, status codes, and routing decisions.

If you're using ECS or Fargate, FireLens can be employed as a log router. This approach minimises the overhead on your application containers while ensuring all logs are collected efficiently.

Envoy proxies within the mesh generate statistics, which can be accessed via the /stats endpoint on port 9901. To further enhance monitoring, you can enable the App Mesh metrics extension by setting the environment variable APPMESH_METRIC_EXTENSION_VERSION to 1. For example, in October 2024, AWS showcased "The DJ App" on Amazon ECS to demonstrate how enabling this extension improved visibility. By integrating metrics scraping with CloudWatch Prometheus, they were also able to optimise storage costs.

To collect specific metrics from your proxies, install and configure the CloudWatch Agent. Pay close attention to the control_plane.connected_state metric, as it indicates whether your Envoy proxies are still connected to the App Mesh control plane.
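For a quick manual check, the text returned by the /stats endpoint can be parsed directly. The sketch below assumes the plain "name: value" line format that Envoy's admin interface emits for counters and gauges; the sample text is invented for illustration, and a value of 1 is taken to mean "connected".

```python
def parse_envoy_stats(stats_text: str) -> dict:
    """Parse 'name: value' lines, keeping only integer-valued stats."""
    stats = {}
    for line in stats_text.splitlines():
        name, sep, value = line.partition(": ")
        value = value.strip()
        # Histogram lines carry non-numeric values and are skipped.
        if sep and value.lstrip("-").isdigit():
            stats[name.strip()] = int(value)
    return stats


def is_connected(stats: dict) -> bool:
    """True when the proxy reports a live control plane connection."""
    return stats.get("control_plane.connected_state") == 1


# Invented sample of what the endpoint might return.
SAMPLE_STATS = """control_plane.connected_state: 1
cluster_manager.active_clusters: 3
server.uptime: 120
"""

if __name__ == "__main__":
    stats = parse_envoy_stats(SAMPLE_STATS)
    print("connected" if is_connected(stats) else "disconnected")
```

Inside a running task or pod, the input text would typically come from a request to http://localhost:9901/stats.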

When analysing logs in CloudWatch Logs Insights, you can use the following parse statements to decode Envoy’s default log format:

For access logs:

parse @message "[*] \"* * *\" * * * * * * * * * * *" as StartTime, Method, Path, Protocol, ResponseCode, ResponseFlags, BytesReceived, BytesSent, DurationMillis, UpstreamServiceTimeMillis, ForwardedFor, UserAgent, RequestId, Authority, UpstreamHost

For process logs:

parse @message "[*][*][*][*] [*] *" as Time, Thread, Level, Name, Source, Message

This level of detail helps you understand traffic patterns and troubleshoot more effectively.
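The same access-log fields can be pulled out of exported log lines outside CloudWatch. The sketch below is a regular-expression equivalent that assumes Envoy's default access log format; the sample line is made up, and a customised log format would need a different pattern.

```python
import re

# One named group per field, in the order of Envoy's default format.
ACCESS_LOG = re.compile(
    r'\[(?P<StartTime>[^\]]+)\] '
    r'"(?P<Method>\S+) (?P<Path>\S+) (?P<Protocol>[^"]+)" '
    r'(?P<ResponseCode>\d+) (?P<ResponseFlags>\S+) '
    r'(?P<BytesReceived>\d+) (?P<BytesSent>\d+) '
    r'(?P<DurationMillis>\d+) (?P<UpstreamServiceTimeMillis>\S+) '
    r'"(?P<ForwardedFor>[^"]*)" "(?P<UserAgent>[^"]*)" '
    r'"(?P<RequestId>[^"]*)" "(?P<Authority>[^"]*)" "(?P<UpstreamHost>[^"]*)"'
)


def parse_access_log(line: str):
    """Return a dict of fields, or None if the line does not match."""
    match = ACCESS_LOG.match(line)
    return match.groupdict() if match else None


# Invented sample line in the default format.
SAMPLE_LINE = (
    '[2024-01-01T12:00:00.000Z] "GET /health HTTP/1.1" 200 - 0 2 5 4 '
    '"-" "curl/8.0" "abc-123" "orders.local:8080" "10.0.0.5:8080"'
)

if __name__ == "__main__":
    fields = parse_access_log(SAMPLE_LINE)
    print(fields["ResponseCode"], fields["DurationMillis"], fields["UpstreamHost"])
```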

Fixing Common Issues

Troubleshooting AWS App Mesh often requires a systematic approach to identify and resolve connectivity, routing, or configuration issues. Start by enabling Envoy access logging and examining sidecar logs for errors such as routing failures, connection issues, or timeouts.

A common problem involves DNS resolution failures. If services cannot resolve virtual service names, ensure a DNS A record is pointing to a non-loopback IP address for the virtual service name.

Connectivity issues to virtual service backends often result from misconfigured compute environments, disconnected Envoy containers, or missing virtual service providers. Check Envoy proxy logs for errors like "No healthy upstream" or "No cluster match for URL" to pinpoint the issue.

For external service communication, verify that your mesh's outbound filter is set to ALLOW_ALL. Alternatively, model external services within the mesh by configuring virtual services, routers, routes, and nodes. If you're facing issues with MySQL or SMTP connections, ensure you're using Envoy image version 1.15.0 or later, and add the necessary ports to APPMESH_EGRESS_IGNORED_PORTS.

When working with TCP virtual nodes, make sure each destination has a unique port. If traffic is unexpectedly succeeding to destinations not defined in the mesh, set the outbound filter to DROP_ALL to enforce stricter control.

HTTP 503 errors can often be mitigated by switching from virtual node providers to virtual router providers and implementing retry policies on routes. For Amazon EFS connection errors, include port 2049 in the EgressIgnoredPorts configuration.
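Both ignored-port settings take a comma-separated list of ports. The helper below is a small sketch that builds that value from a set of ports; the function name is hypothetical, and the example uses the standard ports for SMTP (25), Amazon EFS/NFS (2049) and MySQL (3306).

```python
def ignored_ports_value(ports) -> str:
    """Return a sorted, de-duplicated, comma-separated port list."""
    unique = sorted({int(p) for p in ports})
    for port in unique:
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port}")
    return ",".join(str(port) for port in unique)


def ecs_ignored_ports_property(ports) -> dict:
    """Property entry for the ECS App Mesh proxy configuration."""
    return {"name": "EgressIgnoredPorts", "value": ignored_ports_value(ports)}


if __name__ == "__main__":
    print(ignored_ports_value([3306, 25, 2049, 3306]))  # prints 25,2049,3306
    print(ecs_ignored_ports_property([2049]))
```

Traffic to ignored ports bypasses the proxy entirely, so it also bypasses mesh routing, metrics and encryption; keep the list as short as possible.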

In one case from April 2025, a user reported intermittent traffic failures involving timeouts and 5xx errors. After enabling Envoy access logging and reviewing sidecar logs, they discovered that requests were being misrouted due to a virtual service misconfiguration. Correcting the configuration resolved the issue.

"Intermittent 5xx issues in service meshes can be a nightmare to trace without the right observability tools and deep inspection of traffic paths. Leveraging access logs and config dumps is essential - and so is ensuring that external data routes and proxy layers are stable and secure." – NetNut.io

Timeout issues often require adjustments at both the route and virtual node listener levels. HTTP 400 errors are frequently caused by proxy protocol version 2 (PPv2) being enabled on Network Load Balancer target group attributes. Disabling PPv2 usually resolves the problem.
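To show where the two timeout settings live, the sketch below builds the timeout fragment for an HTTP route and for a virtual node listener. The helper names and the 30-second and 60-second values are arbitrary examples; the nesting follows the App Mesh API shapes as understood here.

```python
import json


def duration(seconds: int) -> dict:
    """App Mesh duration object expressed in seconds."""
    return {"unit": "s", "value": seconds}


def route_timeout(per_request_s: int, idle_s: int) -> dict:
    """Fragment to merge into a route's httpRoute block."""
    return {"timeout": {"perRequest": duration(per_request_s), "idle": duration(idle_s)}}


def listener_timeout(per_request_s: int, idle_s: int) -> dict:
    """Fragment to merge into a virtual node's HTTP listener."""
    return {
        "timeout": {
            "http": {"perRequest": duration(per_request_s), "idle": duration(idle_s)}
        }
    }


if __name__ == "__main__":
    print(json.dumps(route_timeout(30, 60), indent=2))
    print(json.dumps(listener_timeout(30, 60), indent=2))
```

If only one of the two is raised, the lower value still cuts requests short, which is why both levels usually need the same adjustment.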

During pre-production testing, you can set Envoy's log level to debug by using the ENVOY_LOG_LEVEL variable. This provides more detailed diagnostics.

To monitor gRPC config stream status and troubleshoot connection issues, use the following query in CloudWatch Insights:

filter @message like /gRPC config stream closed/
| parse @message "gRPC config stream closed: *, *" as StatusCode, Message

This approach ensures you can quickly identify and resolve communication issues, keeping your services running smoothly.

Best Practices for SMBs Using AWS App Mesh

Using AWS App Mesh effectively requires thoughtful planning, especially when it comes to managing costs, ensuring security, and scaling for growth. Small and medium-sized businesses (SMBs) often face unique challenges with service mesh architectures, but by following proven strategies, they can make the most of this tool while avoiding common hurdles.

Reducing Costs and Managing Resources

One of the advantages of AWS App Mesh is that it doesn't come with extra charges. You only pay for the AWS resources used by the Envoy proxy sidecars running alongside your containers. This makes cost management relatively straightforward, focusing on optimising resource usage.

  • Fine-tune DNS resolution: Efficient DNS resolution not only improves performance but also cuts costs. Depending on your setup, you might need to adjust the default IPv6_PREFERRED setting to IPv4_ONLY or IPv4_PREFERRED if your infrastructure isn’t fully IPv6-ready. Poor DNS configuration can lead to increased latency, failed requests, and extra compute costs from unnecessary retries.
  • Deploy smartly: Costly service disruptions and failed requests can be minimised with effective deployment strategies. For rolling deployments, adjust the pace to reduce divergence and improve retry reliability. For Amazon ECS services, set maximumPercent to 150% for smaller deployments and 125% for larger ones. For Kubernetes, configure maxUnavailable at 0% and maxSurge at 25%.
  • Scale out before scaling in: During deployments, maintaining a higher number of healthy tasks ensures smoother performance and reduces failed requests that lead to resource wastage. Configuring virtual services with default retry policies on all routes can also help prevent cascading failures.
  • Use health checks and dependency ordering: Ensure that Envoy proxies are up and running before starting containers that rely on outbound connectivity.
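The deployment percentages above translate into concrete task counts. The sketch below works through that arithmetic for ECS, assuming a minimumHealthyPercent of 100 alongside the maximumPercent values mentioned, and relying on ECS rounding the lower bound up and the upper bound down. It also shows the equivalent Kubernetes rolling-update settings as a dictionary. The function name is hypothetical.

```python
import math


def ecs_task_bounds(desired_count: int, minimum_healthy_percent: int, maximum_percent: int):
    """Return (min running tasks, max running tasks) during a rolling deployment."""
    lower = math.ceil(desired_count * minimum_healthy_percent / 100)
    upper = math.floor(desired_count * maximum_percent / 100)
    return lower, upper


# Kubernetes Deployment strategy matching the values in the text.
K8S_STRATEGY = {
    "type": "RollingUpdate",
    "rollingUpdate": {"maxUnavailable": "0%", "maxSurge": "25%"},
}

if __name__ == "__main__":
    # A small service of 4 tasks at 150%: between 4 and 6 tasks running.
    print(ecs_task_bounds(4, 100, 150))
    # A larger service of 10 tasks at 125%: between 10 and 12 tasks running.
    print(ecs_task_bounds(10, 100, 125))
```

With 4 tasks, a 125% cap would allow only one extra task at a time, which is why a higher percentage suits small services.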

For more detailed advice on cutting cloud costs without sacrificing performance, check out the AWS Optimization Tips, Costs & Best Practices for Small and Medium sized business blog.

Maintaining Security and Compliance

In service mesh environments, security needs to be multi-layered, addressing both UK regulations and broader cloud security best practices. With most organisations finding traditional security tools lacking in cloud setups, SMBs should prioritise cloud-native solutions.

  • Incorporate Privacy by Design: If your services handle personal data subject to GDPR, bake privacy considerations into your App Mesh setup from the beginning rather than retrofitting them later.
  • Classify and protect data: Establish clear data classifications and apply protection measures, such as mutual TLS encryption for sensitive communications, which App Mesh supports.
  • Apply least privilege principles: Limit service-to-service communication to only what’s necessary using default-deny authorisation policies. This significantly reduces the risk of unauthorised access.
  • Secure logs and retention policies: Ensure logging configurations comply with data retention policies. Treat Envoy access logs, which may contain sensitive information, with strong security controls.
  • Leverage AWS-native security tools: Solutions like AWS Security Hub can centralise security management and automate assessments across your App Mesh environment.
  • Regular compliance checks: Continuously review your infrastructure and applicable regulations to close any security gaps, especially for GDPR and other UK-specific requirements.
  • Segment and secure: Even with App Mesh’s service-to-service encryption, network-level controls and workload segmentation are essential to limit the impact of potential breaches.
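As an illustration of the mutual TLS point, the sketch below builds the TLS fragments for a virtual node: one for its listener (server side) and one for its backend defaults (client side). It assumes file-based certificates already mounted into the Envoy container; all paths are placeholders, the helper names are hypothetical, and the field names should be checked against the App Mesh API reference.

```python
import json


def file_cert(cert_chain: str, private_key: str) -> dict:
    """Certificate loaded from files on the Envoy container's filesystem."""
    return {"file": {"certificateChain": cert_chain, "privateKey": private_key}}


def file_trust(ca_bundle: str) -> dict:
    """Trust settings used to validate the peer's certificate."""
    return {"trust": {"file": {"certificateChain": ca_bundle}}}


def mtls_listener(port: int, cert_chain: str, private_key: str, ca_bundle: str) -> dict:
    """Listener that requires TLS and validates client certificates."""
    return {
        "portMapping": {"port": port, "protocol": "http"},
        "tls": {
            "mode": "STRICT",
            "certificate": file_cert(cert_chain, private_key),
            "validation": file_trust(ca_bundle),
        },
    }


def mtls_backend_defaults(cert_chain: str, private_key: str, ca_bundle: str) -> dict:
    """Client policy that presents a certificate and validates the server."""
    return {
        "clientPolicy": {
            "tls": {
                "enforce": True,
                "certificate": file_cert(cert_chain, private_key),
                "validation": file_trust(ca_bundle),
            }
        }
    }


if __name__ == "__main__":
    paths = ("/certs/service.pem", "/certs/service.key", "/certs/ca.pem")  # placeholders
    print(json.dumps(mtls_listener(8080, *paths), indent=2))
    print(json.dumps(mtls_backend_defaults(*paths), indent=2))
```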

Scaling App Mesh for Growing Workloads

As your business grows, so does the complexity of your service interactions. Scaling AWS App Mesh effectively requires careful planning to maintain performance and reliability.

  • Retry strategies matter: Configure virtual routers with default retry policies to handle increased traffic volumes effectively.
  • Stabilise deployments during scaling: Keep at least 100% of tasks healthy while capping total tasks at 125% during scaling events. This prevents service degradation that can occur with overly aggressive scaling.
  • Health checks and dependency management: Ensure all Envoy sidecars are fully operational before starting dependent application containers.
  • Tailor scaling to your runtime: Whether you’re using Amazon ECS or Kubernetes, adjust health checks and deployment strategies to match the specific needs of your container runtime.
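A default retry policy is attached per route. The sketch below adds one to an existing HTTP route block without mutating the original; the retry counts and timeout are illustrative starting points rather than recommendations, and the function name is hypothetical.

```python
import json


def with_retry_policy(http_route: dict, max_retries: int = 2, per_retry_timeout_s: int = 2) -> dict:
    """Return a copy of an httpRoute block with a retry policy added."""
    route = dict(http_route)  # shallow copy; the caller's dict is left untouched
    route["retryPolicy"] = {
        "maxRetries": max_retries,
        "perRetryTimeout": {"unit": "s", "value": per_retry_timeout_s},
        # Retry on 5xx responses and on connection-level failures.
        "httpRetryEvents": ["server-error", "gateway-error"],
        "tcpRetryEvents": ["connection-error"],
    }
    return route


if __name__ == "__main__":
    base_route = {
        "match": {"prefix": "/"},
        "action": {"weightedTargets": [{"virtualNode": "orders-v1", "weight": 100}]},
    }
    print(json.dumps({"httpRoute": with_retry_policy(base_route)}, indent=2))
```

Retries multiply load on a struggling backend, so keep maxRetries low and only retry operations that are safe to repeat.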

Conclusion

AWS App Mesh provides small and medium-sized businesses (SMBs) with an accessible way to simplify service communication while maintaining reliability.

One of its standout advantages is the financial model. There are no direct charges for using App Mesh itself - businesses only pay for the underlying AWS resources, such as EC2 instances or Fargate. This allows SMBs to access enterprise-level features without an additional service fee. Further savings depend on how those underlying resources are run: TLG Aerospace, for instance, cut costs by 80% compared with its previous cloud and on-premises HPC cluster setups by using Amazon EC2 Spot Instances, a saving that comes from EC2 pricing rather than from App Mesh.

But the benefits go beyond cost savings. App Mesh offers practical operational improvements. It simplifies service management with custom traffic routing rules and strengthens security using authentication controls and encrypted communication. Additionally, the ability to collect metrics, logs, and traces helps businesses quickly identify and resolve issues, leading to a 69% reduction in unplanned downtime - a challenge many face during AWS migrations.

Users have also expressed strong support for App Mesh, with a 100% recommendation rate. The platform eliminates the need for code changes to enable monitoring or routing updates, allowing teams to dedicate more time to strategic growth initiatives.

For SMBs seeking to maximise their AWS investment beyond App Mesh, careful planning and optimisation are key. Resources like the AWS Optimization Tips, Costs & Best Practices for Small and Medium sized business blog provide actionable advice on cutting costs, improving security, and boosting performance. These tips can help businesses extend the benefits of App Mesh and build a solid foundation for their service communication infrastructure.

FAQs

How can UK businesses configure AWS App Mesh while ensuring GDPR compliance?

To set up AWS App Mesh while meeting GDPR requirements, UK businesses need to prioritise data protection principles. First, ensure that all data processed within the App Mesh is stored in AWS regions within the EU or UK, such as London, to comply with data residency rules. This keeps your data closer to home and within regulatory boundaries.

Next, secure sensitive information by enabling encryption for data both in transit and at rest. This adds an essential layer of protection against unauthorised access. Also, review how your services communicate to make sure only the necessary data is exchanged, in line with the principle of data minimisation.

For additional support, explore AWS’s GDPR compliance resources. It’s also a good idea to collaborate with a legal or cloud compliance expert to double-check that your configuration meets all regulatory standards.

What steps should small and medium-sized businesses take to prepare for the end of AWS App Mesh support after 30 September 2026?

To get ready for the end of AWS App Mesh support on 30 September 2026, small and medium-sized businesses (SMBs) should begin crafting a migration plan sooner rather than later. AWS suggests moving to alternatives like Amazon ECS Service Connect or Amazon VPC Lattice, both of which come with powerful features such as traffic management and service discovery to keep your services running smoothly.

If you're looking for more flexibility, consider Istio, an open-source service mesh. It offers advanced traffic control, improved security, and compatibility across multiple cloud platforms, making it a versatile choice for SMBs operating on AWS. Starting the planning and testing process early can help reduce disruptions and ensure a seamless transition, keeping your operations steady and reliable.

What should I consider when choosing between Amazon EKS, ECS, and EC2 for deploying AWS App Mesh in a hybrid environment?

When choosing between Amazon EKS, ECS, and EC2 for deploying AWS App Mesh in a hybrid environment, the right option will depend on your application's specific requirements for flexibility, management, and integration.

  • EKS is well-suited for complex, multi-cloud environments or applications that need a high degree of customisation. It also works seamlessly with on-premises systems, making it a strong choice for workloads with strict data residency rules. However, it demands more operational effort to manage effectively.
  • ECS is a simpler and more cost-efficient solution, especially for applications that run entirely within AWS. Its tight integration with AWS services makes it easier to manage and a better option for straightforward deployments.
  • EC2 offers the highest level of control over your infrastructure, but it comes with increased management responsibilities and lacks the automation provided by the other two options.

Ultimately, the decision comes down to how much control you need, how complex your operations are, and how well the solution aligns with your existing systems.
