  1. DC/OS
  2. DCOS_OSS-2420

CLONE - Request entity too large error

    Details

      Description

      We are adding a new systemd unit called dcos-registry to DC/OS. It is intended as a near replacement for the private universe: like the current private universe, dcos-registry supports adding packages to a repo. dcos-registry also has a CLI component that lets the user run something like dcos registry add --dcos-file=..... 

      This command goes through Admin Router (AR) and lands on package-registry, a service running on one of the agent nodes. However, AR restricts the request body size to 1 GB ( https://github.com/dcos/dcos/blob/master/packages/adminrouter/extra/src/includes/http/common.conf#L1 ), which is a problem when adding packages larger than 1 GB: AR returns a 413 error in that case. I have created a PR ( https://github.com/dcos/dcos/pull/2347/files ) to raise the limit to 4 GB and test whether that works. With that change I no longer get a 413, but I get a different error. 
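
      As a rough illustration of the size check described above, the sketch below compares the Content-Length from this report against a 1 GB and a 4 GB body limit. The limit strings "1024m" and "4096m" are assumptions about how the limits might be written in the nginx config, not values copied from common.conf or the PR, and the helper names are hypothetical.

```python
import re

# nginx size suffixes are binary multiples: k = 1024, m = 1024**2, g = 1024**3.
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_nginx_size(value: str) -> int:
    """Convert an nginx-style size string such as '1024m' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"not an nginx size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.lower()]


def exceeds_body_limit(content_length: int, limit: str) -> bool:
    """Return True when a declared Content-Length is above the body limit."""
    max_bytes = parse_nginx_size(limit)
    # In nginx, a client_max_body_size of 0 disables the check entirely.
    return max_bytes != 0 and content_length > max_bytes


PACKAGE_BYTES = 2536796417  # Content-Length of the upload in this report

# Assumed limit values, for illustration only.
print(exceeds_body_limit(PACKAGE_BYTES, "1024m"))  # True  -> 413 expected
print(exceeds_body_limit(PACKAGE_BYTES, "4096m"))  # False -> request accepted
```

      The package is about 2.36 GiB, so it is rejected under a 1 GB limit and fits under a 4 GB limit, which matches the 413 disappearing after the change.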

      This is what I have done so far:

      The CLI makes a request to AR to upload the file -> AR sends the file to the registry service running on one of the agents -> after the file is received, the registry responds with 200 OK -> AR logs an error at this point -> the CLI receives an empty response. (The AR access log shows HTTP status code 499, which nginx logs when the client closes the connection before a response is sent.)

      Below are the detailed logs for each step (I have also attached the diagnostic bundle):

      1. Admin Router receives the HTTP request from the CLI. I am not sure what the client 10.0.6.229 refers to.

      Jan 31 02:15:10 ip-10-0-4-87.us-west-2.compute.internal adminrouter.sh[4296]: 2018/01/31 02:15:10 [notice] 4305#0: *15199 [lua] service.lua:104: resolve_via_marathon_apps_state(): Resolved via Marathon, service id: `registry`, client: 10.0.6.229, server: master.mesos, request: "POST /service/registry/add HTTP/1.1", host: "takirala-elasticl-p3zv024rg392-520521878.us-west-2.elb.amazonaws.com"
      Jan 31 02:15:10 ip-10-0-4-87.us-west-2.compute.internal adminrouter.sh[4296]: 2018/01/31 02:15:10 [notice] 4305#0: *15199 [lua] common.lua:125: validate_jwt(): UID from the valid DC/OS authentication token: `bootstrapuser`, client: 10.0.6.229, server: master.mesos, request: "POST /service/registry/add HTTP/1.1", host: "takirala-elasticl-p3zv024rg392-520521878.us-west-2.elb.amazonaws.com"
      Jan 31 02:15:10 ip-10-0-4-87.us-west-2.compute.internal adminrouter.sh[4296]: 2018/01/31 02:15:10 [notice] 4305#0: *15199 [lua] ee.lua:122: check_access_control_entry_or_exit(): Consult policyquery via `/internal/acs/api/v1/internal/policyquery?rid=dcos:adminrouter:service:registry&uid=bootstrapuser&action=full`, client: 10.0.6.229, server: master.mesos, request: "POST /service/registry/add HTTP/1.1", host: "takirala-elasticl-p3zv024rg392-520521878.us-west-2.elb.amazonaws.com"
      Jan 31 02:15:10 ip-10-0-4-87.us-west-2.compute.internal adminrouter.sh[4296]: 2018/01/31 02:15:10 [notice] 4305#0: *15199 [lua] ee.lua:57: auditlog(): type=audit timestamp=2018-01-31T02:15:10Z authorizer=adminrouter object=dcos:adminrouter:service:registry action=full result=allow reason="IAM PQ response" srcip=10.0.6.229 srcport=7950 request_uri=/service/registry/add uid=bootstrapuser while sending to client, client: 10.0.6.229, server: master.mesos, request: "POST /service/registry/add HTTP/1.1", host: "takirala-elasticl-p3zv024rg392-520521878.us-west-2.elb.amazonaws.com"
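
      On the question of what the client address in these log lines is: a quick check, sketched below with Python's standard ipaddress module, shows that 10.0.6.229 is a private (RFC 1918) address, so it cannot be the public address of the machine running the CLI. Since the request Host is an ELB hostname, one plausible reading is that it is the load balancer's address inside the cluster network, but that is an assumption the logs do not confirm.

```python
import ipaddress

# Client address as reported by Admin Router in the log lines above.
client = ipaddress.ip_address("10.0.6.229")

# True for RFC 1918 ranges such as 10.0.0.0/8.
print(client.is_private)

# Explicit membership test against the 10.0.0.0/8 block.
print(client in ipaddress.ip_network("10.0.0.0/8"))
```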


      2. Registry Service receives the request from AR. Registry service output:

      02:15:10.796 INFO [nioEventLoopGroup-3-4] com.mesosphere.http.BridgeHandler - HTTP request: DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
      POST /add HTTP/1.1
      Host: takirala-elasticl-p3zv024rg392-520521878.us-west-2.elb.amazonaws.com
      X-Real-IP: 10.0.6.229
      X-Forwarded-For: 10.0.6.229
      X-Forwarded-Proto: https
      Connection: upgrade
      Content-Length: 2536796417
      User-Agent: Go-http-client/1.1
      Accept: application/vnd.dcos.registry.add-response+json;charset=utf-8;version=v1
      Authorization: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOiJib290c3RyYXB1c2VyIiwiZXhwIjoxNTE3Nzk2MTExfQ.hLY8z2ow_fGuqiHfsK4HiYM8NdL1-9A-cuAbklAoRmXL7cpWkwBNRr5wvT9X_NgCi_95f5unU4iLbayw5iPL_6HtHaQVktsJ-C7M3DR96jHBPEiREpcSuMTy3rnmMvjawi7SWuU3yYw4SmVFRAeXQdmPUJLoJhm8OrSViJKt4oBDgPSewychCBY6rOYsuiBVN0sZsNuuQsgkJRKjgZJiRSBs0PAkyKxJhibUki5aAcIojiRJxzRXcZ0u2LTKPYlpTM8i9O3ufE1fvfFonlZuSuybALGBY7ABiM-ggNCigvS4gjt0UBDtpWKqSK9mrR7zYA77UCJRwbNFUAPgzuMTwQ
      Content-Type: application/vnd.dcos.universe.package+zip;version=v1

      3. File streaming from CLI -> AR -> Registry

      4. Registry responds:

      03:26:29.661 INFO [nioEventLoopGroup-3-4] com.mesosphere.http.BridgeHandler - HTTP response: DefaultHttpResponse(decodeResult: success, version: HTTP/1.1)
      HTTP/1.1 200 OK
      content-type: application/vnd.dcos.registry.add-response+json;charset=utf-8;version=v1
      content-length: 61

      5. AR closes the connection to the client.

      Jan 31 03:25:16 ip-10-0-4-87.us-west-2.compute.internal adminrouter.sh[4296]: 2018/01/31 03:25:16 [info] 4305#0: *15199 epoll_wait() reported that client prematurely closed connection, so upstream connection is closed too while reading response header from upstream, client: 10.0.6.229, server: master.mesos, request: "POST /service/registry/add HTTP/1.1", upstream: "https://10.0.1.251:25465/add", host: "takirala-elasticl-p3zv024rg392-520521878.us-west-2.elb.amazonaws.com"
      Jan 31 03:25:16 ip-10-0-4-87.us-west-2.compute.internal nginx[4305]: ip-10-0-4-87.us-west-2.compute.internal nginx: 10.0.6.229 - - [31/Jan/2018:03:25:16 +0000] "POST /service/registry/add HTTP/1.1" 499 0 "-" "Go-http-client/1.1"
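
      To put the timestamps from steps 1, 4 and 5 side by side, here is a small sketch that computes the intervals between them. It assumes the registry log lines (which carry no date) are from the same day as the AR lines and that the clocks on the two hosts are in sync; neither is confirmed by the logs.

```python
from datetime import datetime

FMT = "%Y/%m/%d %H:%M:%S"

# Timestamps taken from the log excerpts above (sub-second parts dropped).
ar_request_seen = datetime.strptime("2018/01/31 02:15:10", FMT)
ar_client_closed = datetime.strptime("2018/01/31 03:25:16", FMT)
# Assumption: the registry's 03:26:29 line is from the same date.
registry_responded = datetime.strptime("2018/01/31 03:26:29", FMT)

# Time from the request reaching AR until AR logged the 499.
open_seconds = (ar_client_closed - ar_request_seen).total_seconds()

# Time from AR logging the 499 until the registry logged its 200 OK.
gap_seconds = (registry_responded - ar_client_closed).total_seconds()

print(open_seconds)  # 4206.0
print(gap_seconds)   # 73.0
```

      Under those assumptions, AR logged the closed client connection roughly 73 seconds before the registry logged its 200 OK, which would suggest that the client side of the connection went away while AR was still waiting for the upstream response.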

      6. This is the output on the client end. I have tested this locally.

      ➜ dcos registry add --dcos-file=gestalt-framework-1.2.0.dcos --verbose
      Running DC/OS CLI command: dcos config show core.dcos_url
      Running DC/OS CLI command: dcos config show core.dcos_acs_token
      Running DC/OS CLI command: dcos config show core.ssl_verify
      [Debug] 2018/01/30 18:15:09 Request:
      POST /service/registry/add HTTP/1.1
      Host: takirala-elasticl-p3zv024rg392-520521878.us-west-2.elb.amazonaws.com
      User-Agent: Go-http-client/1.1
      Content-Length: 2536796417
      Accept: application/vnd.dcos.registry.add-response+json;charset=utf-8;version=v1
      Authorization: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOiJib290c3RyYXB1c2VyIiwiZXhwIjoxNTE3Nzk2MTExfQ.hLY8z2ow_fGuqiHfsK4HiYM8NdL1-9A-cuAbklAoRmXL7cpWkwBNRr5wvT9X_NgCi_95f5unU4iLbayw5iPL_6HtHaQVktsJ-C7M3DR96jHBPEiREpcSuMTy3rnmMvjawi7SWuU3yYw4SmVFRAeXQdmPUJLoJhm8OrSViJKt4oBDgPSewychCBY6rOYsuiBVN0sZsNuuQsgkJRKjgZJiRSBs0PAkyKxJhibUki5aAcIojiRJxzRXcZ0u2LTKPYlpTM8i9O3ufE1fvfFonlZuSuybALGBY7ABiM-ggNCigvS4gjt0UBDtpWKqSK9mrR7zYA77UCJRwbNFUAPgzuMTwQ
      Content-Type: application/vnd.dcos.universe.package+zip;version=v1
      Accept-Encoding: gzip
      
      HTTP POST Query for https://takirala-elasticl-p3zv024rg392-520521878.us-west-2.elb.amazonaws.com/service/registry/add failed: EOF
      - Is 'core.dcos_url' set correctly? Check 'dcos config show core.dcos_url'.
      - Is 'core.dcos_acs_token' set correctly? Run 'dcos auth login' to log in.
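
      The CLI hints above point at the URL and the token. As a sanity check on the token hint, the sketch below decodes the payload (middle) segment of the token shown in the request, without verifying the signature, and prints its expiry. The helper name is hypothetical; only the payload segment is copied from the request.

```python
import base64
import json
from datetime import datetime, timezone


def decode_jwt_payload(segment: str) -> dict:
    """Decode a base64url JWT payload segment (no signature verification)."""
    # JWT segments omit base64 padding, so restore it before decoding.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Middle segment of the Authorization token from the request above.
PAYLOAD_SEGMENT = "eyJ1aWQiOiJib290c3RyYXB1c2VyIiwiZXhwIjoxNTE3Nzk2MTExfQ"

claims = decode_jwt_payload(PAYLOAD_SEGMENT)
expires = datetime.fromtimestamp(claims["exp"], tz=timezone.utc)

print(claims["uid"])        # bootstrapuser
print(expires.isoformat())  # 2018-02-05T02:01:51+00:00
```

      The token expires several days after the upload on 2018-01-31, so token expiry during the transfer does not appear to explain the EOF.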

      To make sure the CLI is not the culprit, I made the same request with curl. As expected, the result is the same:

      ➜ curl -k -H "Authorization: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOiJib290c3RyYXB1c2VyIiwiZXhwIjoxNTE3Nzk2MTExfQ.hLY8z2ow_fGuqiHfsK4HiYM8NdL1-9A-cuAbklAoRmXL7cpWkwBNRr5wvT9X_NgCi_95f5unU4iLbayw5iPL_6HtHaQVktsJ-C7M3DR96jHBPEiREpcSuMTy3rnmMvjawi7SWuU3yYw4SmVFRAeXQdmPUJLoJhm8OrSViJKt4oBDgPSewychCBY6rOYsuiBVN0sZsNuuQsgkJRKjgZJiRSBs0PAkyKxJhibUki5aAcIojiRJxzRXcZ0u2LTKPYlpTM8i9O3ufE1fvfFonlZuSuybALGBY7ABiM-ggNCigvS4gjt0UBDtpWKqSK9mrR7zYA77UCJRwbNFUAPgzuMTwQ" -H "Content-Length: 2536796417" -i --request POST --data-binary "@gestalt-framework-1.2.0.dcos" https://takirala-elasticl-p3zv024rg392-520521878.us-west-2.elb.amazonaws.com/service/registry/add
      HTTP/1.1 100 Continue
      
      
      
      
      
      
      
      curl: (52) Empty reply from server

      Because the registry service responds to AR with 200 OK, the package is added and everything else works as expected, but this error still shows up on the client end.

              People

              • Assignee:
                prozlach Pawel Rozlach
                Reporter:
                jlariosmurillo Jesus E. Larios Murillo (Inactive)
                Team:
                DELETE Security Team
                Watchers:
                Artem Harutyunyan (Inactive), Jan-Philip Gehrcke (Inactive), José Armando García Sancio (Inactive), Marco Monaco, Pawel Rozlach, Tarun Gupta Akirala