gRPC + REST API on AWS

This post shows a setup of a gRPC service that is also exposed as a REST API. It’s a setup that happens to work for us. No alternatives are discussed in this post.

This is a concise blog post.

Architecture

  1. ALB with HTTPS listener (trivially configured, out of scope of this post)
  2. ECS running a task with 3 containers:
    • API Gateway, implemented with Envoy. It:
      • authorizes requests using the service in the next container
      • proxies gRPC requests
      • proxies REST requests (converting them to upstream gRPC requests)
    • Authorization service, implemented with OPA
    • Our gRPC application

Notes

Health checks are not in very good shape yet.

ECS Configuration (Simplified Excerpt)

In case the reader is not familiar, the configuration below is CloudFormation.

  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      ContainerDefinitions:
        - Name: apigw
          Image: !Ref ApiGwImage
          PortMappings:
            - ContainerPort: !Ref ContainerPort
        - Name: opa
          Image: !Ref OpaImage
          PortMappings:
            - ContainerPort: 9191
        - Name: app
          Image: !Ref AppImage
          PortMappings:
            - ContainerPort: 4000

  Service:
    DependsOn:
      - GrpcListenerRule
      - RestListenerRule
      - GrpcTargetGroup
      - RestTargetGroup
    Type: AWS::ECS::Service
    Properties:
      ServiceName: !Ref ServiceName
      Cluster: !Ref Cluster
      TaskDefinition: !Ref TaskDefinition
      LoadBalancers:
        - ContainerName: apigw
          ContainerPort: !Ref ContainerPort
          TargetGroupArn: !Ref GrpcTargetGroup
        - ContainerName: apigw
          ContainerPort: !Ref ContainerPort
          TargetGroupArn: !Ref RestTargetGroup

  GrpcTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      HealthCheckIntervalSeconds: 10
      HealthCheckPath: /
      HealthCheckTimeoutSeconds: 5
      Matcher:
        GrpcCode: "0-99"
      UnhealthyThresholdCount: 2
      HealthyThresholdCount: 2
      Port: !Ref ContainerPort
      Protocol: HTTP
      ProtocolVersion: GRPC
      TargetGroupAttributes:
        - Key: deregistration_delay.timeout_seconds
          Value: 60 # default is 300
      TargetType: ip
      VpcId: !ImportValue VpcId

  RestTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      HealthCheckIntervalSeconds: 10
      HealthCheckPath: /rest/not-found
      HealthCheckTimeoutSeconds: 5
      Matcher:
        HttpCode: 404
      UnhealthyThresholdCount: 2
      HealthyThresholdCount: 2
      Port: !Ref ContainerPort
      Protocol: HTTP
      ProtocolVersion: HTTP1
      TargetGroupAttributes:
        - Key: deregistration_delay.timeout_seconds
          Value: 60 # default is 300
      TargetType: ip
      VpcId: !ImportValue VpcId

  GrpcListenerRule:
    Type: AWS::ElasticLoadBalancingV2::ListenerRule
    Properties:
      Actions:
        - Type: forward
          TargetGroupArn: !Ref GrpcTargetGroup
      Conditions:
        - Field: path-pattern
          PathPatternConfig:
            Values:
              - '/censored.v1.CensoredService/*'
              - '/censored.v1.CensoredAdminService/*'
              - '/censored.v1.CensoredSystemService/*'
      ListenerArn: ...
      Priority: 1000

  RestListenerRule:
    Type: AWS::ElasticLoadBalancingV2::ListenerRule
    Properties:
      Actions:
        - Type: forward
          TargetGroupArn: !Ref RestTargetGroup
      Conditions:
        - Field: path-pattern
          PathPatternConfig:
            Values:
              - '/rest/v1/*'
      ListenerArn: ...
      Priority: 1001
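
To make the two listener rules above concrete, here is a minimal Python sketch of how a request path ends up in one of the target groups. It is only an approximation of ALB behavior: it assumes that rules are evaluated from the lowest priority number up and that `*` matches any run of characters, and it uses Python's fnmatch as a stand-in for the ALB matcher. The function name pick_target_group is made up for this illustration.

```python
from fnmatch import fnmatchcase

# (priority, target group, path patterns) copied from the two listener rules above.
RULES = [
    (1000, "GrpcTargetGroup", [
        "/censored.v1.CensoredService/*",
        "/censored.v1.CensoredAdminService/*",
        "/censored.v1.CensoredSystemService/*",
    ]),
    (1001, "RestTargetGroup", ["/rest/v1/*"]),
]


def pick_target_group(path):
    """Return the target group of the first matching rule (hypothetical helper)."""
    # Assumption: the rule with the lowest priority number is evaluated first.
    for _priority, target_group, patterns in sorted(RULES, key=lambda rule: rule[0]):
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return target_group
    return None  # no rule matched; the listener's default action would apply


if __name__ == "__main__":
    for path in ("/censored.v1.CensoredService/MyGet", "/rest/v1/my-objs/42", "/other"):
        print(path, "->", pick_target_group(path))
```

Both target groups point at the same apigw container and port; the split exists only so that each group can use its own protocol version and health check.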

Envoy Configuration (Simplified Excerpt)

static_resources:
  listeners:
    - address:
        socket_address:
          address: 0.0.0.0
          port_value: 8000
      filter_chains:
        - filters:
            - name: Connection Manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                via: CensoredGW
                route_config:
                  name: Static response for tests
                  virtual_hosts:
                    - name: backend
                      domains:
                        - "*"
                      routes:
                        - match:
                            prefix: "/test/static"
                          direct_response:
                            status: 200
                            body:
                              inline_string: "Static response for tests"
                        # Reference: https://envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/grpc_json_transcoder_filter#route-configs-for-transcoded-requests
                        - match:
                            prefix: "/"
                          route:
                            cluster: upstream
                            timeout: 60s
                http_filters:
                  - name: envoy.filters.http.grpc_json_transcoder
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
                      # maybe disable later:
                      auto_mapping: true
                      proto_descriptor: "../path/to/proto_descriptor.bin" ### See next heading in this post
                      services:
                        - censored.v1.CensoredService
                        - censored.v1.CensoredAdminService
                        - censored.v1.CensoredSystemService
                      print_options:
                        add_whitespace: true
                        always_print_primitive_fields: true
                      request_validation_options:
                        reject_unknown_method: true
                        reject_unknown_query_parameters: true
                  - name: envoy.filters.http.cors
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
                  - name: envoy.ext_authz
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
                      failure_mode_allow: false
                      with_request_body:
                        max_request_bytes: 10485760 # 10M
                        allow_partial_message: false
                        pack_as_bytes: true
                      transport_api_version: V3
                      grpc_service:
                        envoy_grpc:
                          cluster_name: opa-agent
                        timeout: 10s
                  - name: envoy.filters.http.router
                    # https://github.com/envoyproxy/envoy/issues/21464
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                always_set_request_id_in_response: true
                access_log:
                  - typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                      # https://www.envoyproxy.io/docs/envoy/latest/configuration/observability/access_log/usage#config-access-log-default-format

  # Based on https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/grpc_json_transcoder_filter
  clusters:
    - name: opa-agent
      connect_timeout: 0.25s
      type: STRICT_DNS
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: { }
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 9191
    - name: upstream
      type: STRICT_DNS
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}
      load_assignment:
        cluster_name: grpc
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 4000

proto_descriptor.bin

GrpcJsonTranscoder must have the proto descriptor file in order to know how to transcode. The file contains:

  1. proto definitions of your services, including the extension that describes how to expose the services as REST
  2. dependencies of the above proto definitions

The descriptor file is generated using a command similar to the following:

buf build -o proto_descriptor.bin --as-file-descriptor-set --path path/to/my.proto

buf is a way to manage .proto files and their dependencies (very imprecise definition, sorry).

The descriptor can also be generated with protoc (without buf), using its --descriptor_set_out and --include_imports options; we use buf, so that command is not shown here.

grpcurl

The same descriptor file is used with grpcurl when you later test your service from the command line:

grpcurl -H "Authorization: Bearer ..." -protoset proto_descriptor.bin "example.com:443" censored.service.name/MyFunc

my.proto

This is what a protobuf definition with the REST extension looks like (excerpt):

import "google/api/annotations.proto";

service Censored {
  rpc MyCreate(CreateRequest) returns (CreateResponse) {
    option (google.api.http) = { post: "/rest/v1/my-objs" };
  }
  rpc MyGet(GetRequest) returns (GetResponse) {
    option (google.api.http) = { get: "/rest/v1/my-objs/{id}" };
  }
}

Excerpt from buf.yaml corresponding to the import above:

version: v1

deps:
  - buf.build/googleapis/googleapis
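
To illustrate what the google.api.http options in my.proto mean for a REST client, here is a rough Python sketch of the mapping from an HTTP method and path to an RPC name plus request fields. This is not how Envoy implements transcoding; it only handles plain {field} placeholders (the real path template syntax is richer), and template_to_regex, ROUTES and resolve are names invented for this example.

```python
import re


def template_to_regex(template):
    """Turn '/rest/v1/my-objs/{id}' into a regex with one named group per {field}."""
    # re.split with one capture group alternates: literal, field, literal, ...
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        f"(?P<{part}>[^/]+)" if index % 2 else re.escape(part)
        for index, part in enumerate(parts)
    )
    return re.compile(f"^{pattern}$")


# (HTTP method, path regex, RPC name) taken from the two annotations in my.proto.
ROUTES = [
    ("POST", template_to_regex("/rest/v1/my-objs"), "MyCreate"),
    ("GET", template_to_regex("/rest/v1/my-objs/{id}"), "MyGet"),
]


def resolve(method, path):
    """Return (rpc_name, path_fields) for a REST request, or None if nothing matches."""
    for verb, regex, rpc_name in ROUTES:
        match = regex.match(path)
        if verb == method and match:
            return rpc_name, match.groupdict()
    return None


if __name__ == "__main__":
    print(resolve("GET", "/rest/v1/my-objs/42"))
    print(resolve("POST", "/rest/v1/my-objs"))
```

In this sketch the {id} segment of the URL becomes the id field of GetRequest, which is the idea behind the annotation.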


Hope this helps.

Sorry, I was in a rush to get this out. If anything is unclear or missing, please let me know.

The Case for Concise Posts

According to DiSC, about a quarter of all people should be communicating like me. We want information, not fluff or stories. We are here to get the answer to our question: “what’s X?” (a technology, a format, a piece of software, etc.). Yet, the number of blog posts that answer “What’s X?” concisely is roughly zero. I am going to fix this with my future posts as time allows. Stay tuned.

Everything below is implementation detail. You can stop reading here and save a few minutes.

Here is my plan. Feel free to use it as a guideline for your blog posts too.

Discretion

We are looking for the author’s discretion about what’s important (hint: typically concepts and architecture), not a dump of everything that the author knows about the topic. We are here for “I would have written a shorter letter, but I did not have the time”. Yes, that 5-minute read should take hours if not days to write. Otherwise, what’s the value?

Can’t answer the question concisely? We doubt your understanding of the topic.

Context

“Context is important, the blog post must provide context, blah blah …”.

The context was already established by searching “what’s X” or following a link to the post. It’s annoying when the text starts with fluff, keeping us wondering when and whether our question will be answered.

If there is some *really* important context, that’s not 5 paragraphs. Sorry, storytellers.

Show why the Topic is Important

“You need to show why X is important and where it’s used”.

This information is available in every other blog post and/or on Wikipedia and/or is one search away.

Underlying Concepts

Don’t explain the underlying concepts, link to them (like DiSC above). Don’t waste our time if we already know them.


Stay tuned.

Event Loop for Beginners

The aim of the post is to give a simple, concise, and high-level description of the event loop. I’m intentionally leaving out many details and being somewhat imprecise.

If you need a detailed picture, follow the links in this post. Even in that case, I recommend reading this short post first.

Why Event Loop?

To deal with concurrency (multiple “things” need to be happening at what looks like the same time), there are a few models. To discuss the models, we need to know what a thread is.

Roughly, a thread is a sequence of computing instructions that runs on a single CPU core.

Roughly, the Event Loop manages which tasks the thread works on and in which order.

Queue

The queue contains what different sources call messages, events, or tasks.

A message/event/task is a reference to a piece of code that needs to be scheduled to run. Example: “the code responsible for handling the click of button B + full details of the click event to pass to the code”.

Event Loop

  1. The queue is checked for new tasks. If there is none, we wait for one to appear.
  2. The first task in the queue is scheduled and starts running. The code runs to completion. In many languages, the await keyword in the code counts as completion, and everything after await is scheduled to run later as a new task in the queue.
  3. Repeat from step 1. That’s the loop. It’s called the Event Loop because it processes events from the queue in a loop.
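
The three steps above can be sketched in a few lines of Python. This is a toy, not how any real runtime is implemented: tasks are generators, yield plays the role of await, and the loop exits when the queue is empty instead of waiting for new tasks. The names run_event_loop and task are made up for this sketch.

```python
from collections import deque


def run_event_loop(queue):
    """Run tasks from the queue in order until it is empty; return what they logged."""
    log = []
    while queue:                    # step 1: check the queue (a real loop would wait here)
        current = queue.popleft()   # step 2: take the first task ...
        try:
            log.append(next(current))  # ... and run it until it finishes or hits a yield
        except StopIteration:
            continue                # the task finished; nothing to reschedule
        queue.append(current)       # the code after the yield becomes a new task in the queue
    return log                      # step 3 is the while loop itself


def task(name, steps):
    """A toy task: does `steps` pieces of work, pausing (yield) after each one."""
    for step in range(1, steps + 1):
        yield f"{name}{step}"


if __name__ == "__main__":
    print(run_event_loop(deque([task("a", 2), task("b", 1)])))
    # prints ['a1', 'b1', 'a2']
```

Task a pauses after its first piece of work, so b gets to run before a continues. That interleaving on a single thread is the whole point of the loop.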

Adding Events to the Queue

Tasks are added to the queue for two reasons:

  1. Something happened (user clicked on a button, network packet arrived, etc.).
  2. The code that was running in step 2 of the event loop scheduled something to run later.
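
Both reasons can be seen with Python's asyncio event loop. In this small sketch the first call_soon stands in for "something happened" (a real click would be queued by the runtime, not by our code), and the second call_soon is running code scheduling something for later. The names demo, on_click and follow_up are invented for the example.

```python
import asyncio


def demo():
    """Queue one 'external' event whose handler schedules a follow-up task."""
    order = []
    loop = asyncio.new_event_loop()

    def follow_up():
        order.append("follow-up")
        loop.stop()  # nothing else to do, so let run_forever() return

    def on_click(button):
        order.append(f"click:{button}")
        loop.call_soon(follow_up)  # reason 2: running code schedules something for later

    loop.call_soon(on_click, "B")  # reason 1: stand-in for "user clicked on button B"
    loop.run_forever()
    loop.close()
    return order


if __name__ == "__main__":
    print(demo())  # prints ['click:B', 'follow-up']
```

The follow-up does not run inside on_click. It is added to the queue and runs on a later pass of the loop.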

See Also

  1. Event Loop documentation at MDN.
  2. What is the difference between concurrency and parallelism? at Stack Overflow.

Hope this helps with a high-level understanding of the Event Loop. Have a nice day!