)]}'
{"specs/rocky/oslo-healthcheck-middleware.rst":[{"author":{"_account_id":2472,"name":"Doug Hellmann","email":"dhellmann@redhat.com","username":"doug-hellmann"},"change_message_id":"1dad46b1eb1d0505fdcd9613dcb1d4f8dfc73a1d","unresolved":false,"context_lines":[{"line_number":95,"context_line":""},{"line_number":96,"context_line":"It should also have a ``view_name`` variable that will used to distinguish the test"},{"line_number":97,"context_line":"from other tests, and a ``timeout`` variable that indicates the oldest \"HEALTHY\" "},{"line_number":98,"context_line":"result, after which time, the test result will be considered \"UNHEALTHY\"."},{"line_number":99,"context_line":""},{"line_number":100,"context_line":"When a request to the middleware arrives, it looks up the list of test results,"},{"line_number":101,"context_line":"and compiles them into the responses below."}],"source_content_type":"text/x-rst","patch_set":1,"id":"9f91af0f_bab6cff9","line":98,"updated":"2018-01-05 18:18:13.000000000","message":"It seems like we could use the plugin name as the view_name, couldn\u0027t we?\n\nWill the plugin track the timeout value, or will the caller?","commit_id":"fc626c1fb47c8e09041b05592694ce78d93e225d"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"83e2d34b2001450348e19faa28902c66a263baeb","unresolved":false,"context_lines":[{"line_number":95,"context_line":""},{"line_number":96,"context_line":"It should also have a ``view_name`` variable that will used to distinguish the test"},{"line_number":97,"context_line":"from other tests, and a ``timeout`` variable that indicates the oldest \"HEALTHY\" "},{"line_number":98,"context_line":"result, after which time, the test result will be considered \"UNHEALTHY\"."},{"line_number":99,"context_line":""},{"line_number":100,"context_line":"When a request to the middleware arrives, it looks up the list of test results,"},{"line_number":101,"context_line":"and 
compiles them into the responses below."}],"source_content_type":"text/x-rst","patch_set":1,"id":"9f91af0f_eadb0879","line":98,"in_reply_to":"9f91af0f_bab6cff9","updated":"2018-01-08 14:38:27.000000000","message":"\u003e It seems like we could use the plugin name as the view_name,\n \u003e couldn\u0027t we?\n\nYeap - we could.\n\n \u003e Will the plugin track the timeout value, or will the caller?\n\nI would imagine the middleware will do something like checking the last results timestamp, compare it to current time - timeout and decide if it has timed out.","commit_id":"fc626c1fb47c8e09041b05592694ce78d93e225d"},{"author":{"_account_id":2472,"name":"Doug Hellmann","email":"dhellmann@redhat.com","username":"doug-hellmann"},"change_message_id":"1dad46b1eb1d0505fdcd9613dcb1d4f8dfc73a1d","unresolved":false,"context_lines":[{"line_number":121,"context_line":"      {"},{"line_number":122,"context_line":"        \"status\": \"HEALTHY\","},{"line_number":123,"context_line":"        \"test_failures\": 0,"},{"line_number":124,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":125,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":126,"context_line":"        \"service\": \"designate-api\","},{"line_number":127,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","}],"source_content_type":"text/x-rst","patch_set":1,"id":"9f91af0f_3aaddf87","line":124,"updated":"2018-01-05 18:18:13.000000000","message":"Is the top level status determined by aggregating the other values in some way? 
Should the middleware do that interpretation, or should the consuming application?","commit_id":"fc626c1fb47c8e09041b05592694ce78d93e225d"},{"author":{"_account_id":19159,"name":"Ifat Afek","email":"ifat.afek@nokia.com","username":"ifat_afek"},"change_message_id":"01e0a7298e13c6d813fc33ca1c6fbfadc93aaf23","unresolved":false,"context_lines":[{"line_number":121,"context_line":"      {"},{"line_number":122,"context_line":"        \"status\": \"HEALTHY\","},{"line_number":123,"context_line":"        \"test_failures\": 0,"},{"line_number":124,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":125,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":126,"context_line":"        \"service\": \"designate-api\","},{"line_number":127,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","}],"source_content_type":"text/x-rst","patch_set":1,"id":"9f91af0f_3a3a7acb","line":124,"in_reply_to":"9f91af0f_1ffbe878","updated":"2018-01-09 16:33:10.000000000","message":"I completely agree that the severity depends on the deployment. But I\u0027m concerned that there might be cases where a single service is always down, and it\u0027s not considered a severe problem in the specific deployment, but it will cause the health check to always return \u0027unhealthy\u0027. 
Then, if another service fails later on, nobody will notice the difference.\n\nAnyway, I think we can leave this use case to future development.","commit_id":"fc626c1fb47c8e09041b05592694ce78d93e225d"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"83e2d34b2001450348e19faa28902c66a263baeb","unresolved":false,"context_lines":[{"line_number":121,"context_line":"      {"},{"line_number":122,"context_line":"        \"status\": \"HEALTHY\","},{"line_number":123,"context_line":"        \"test_failures\": 0,"},{"line_number":124,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":125,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":126,"context_line":"        \"service\": \"designate-api\","},{"line_number":127,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","}],"source_content_type":"text/x-rst","patch_set":1,"id":"9f91af0f_4a39d4a1","line":124,"in_reply_to":"9f91af0f_3aaddf87","updated":"2018-01-08 14:38:27.000000000","message":"It is \"UNHEALTHY\" if any test is marked as \"UNHEALTHY\", (see line 103) - and the middleware should do the aggregation.","commit_id":"fc626c1fb47c8e09041b05592694ce78d93e225d"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"7a8a0a5a66e73a95b8bed1f650faebe7e48a3640","unresolved":false,"context_lines":[{"line_number":121,"context_line":"      {"},{"line_number":122,"context_line":"        \"status\": \"HEALTHY\","},{"line_number":123,"context_line":"        \"test_failures\": 0,"},{"line_number":124,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":125,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":126,"context_line":"        \"service\": \"designate-api\","},{"line_number":127,"context_line":"        \"service_id\": 
\"af91edb5-ede8-453f-af13-feabdd088f9c\","}],"source_content_type":"text/x-rst","patch_set":1,"id":"9f91af0f_1ffbe878","line":124,"in_reply_to":"9f91af0f_493a22cc","updated":"2018-01-09 16:21:14.000000000","message":"Well, this will be running in front of each service, so will be a service specific result.\n\nThis will actually be more granular than \"nova\" - this will be the health of the specific \"nova-api\" / \"nova-metadata-api\" instance that is being called.\n\nSee the section below for getting the results of more services (but this will take longer to get developed, I think for the initial version we will focus on a single service, and add the ``/heathcheck/v1/services/`` endpoint in an add on release.\n\nI think we should not try and dictate what severity *we* think the failure is, as aspiers pointed out, that is very deployment specific.","commit_id":"fc626c1fb47c8e09041b05592694ce78d93e225d"},{"author":{"_account_id":19159,"name":"Ifat Afek","email":"ifat.afek@nokia.com","username":"ifat_afek"},"change_message_id":"50e0b8b040d35eb63402ad3fb65409dbd4ee6f06","unresolved":false,"context_lines":[{"line_number":121,"context_line":"      {"},{"line_number":122,"context_line":"        \"status\": \"HEALTHY\","},{"line_number":123,"context_line":"        \"test_failures\": 0,"},{"line_number":124,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":125,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":126,"context_line":"        \"service\": \"designate-api\","},{"line_number":127,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","}],"source_content_type":"text/x-rst","patch_set":1,"id":"9f91af0f_493a22cc","line":124,"in_reply_to":"9f91af0f_4a39d4a1","updated":"2018-01-09 15:05:16.000000000","message":"Are all tests of the same importance? e.g. unhealthy Nova might be more severe than other unhealthy services... 
\nmaybe we should consider severity for the \u0027unhealthy\u0027 status.","commit_id":"fc626c1fb47c8e09041b05592694ce78d93e225d"},{"author":{"_account_id":19298,"name":"Nicolas Bock","email":"nicolas.bock@canonical.com","username":"nicolasbock"},"change_message_id":"7be8a7671a465f87682eea0db8092468f69d56ed","unresolved":false,"context_lines":[{"line_number":34,"context_line":"Proposed adoption model/plan"},{"line_number":35,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":36,"context_line":""},{"line_number":37,"context_line":"Initially this will be worked through using Designate, and as the model"},{"line_number":38,"context_line":"solidifies we will start encouraging other projects to use it."},{"line_number":39,"context_line":""},{"line_number":40,"context_line":"Initial Design"}],"source_content_type":"text/x-rst","patch_set":3,"id":"9f91af0f_903847a0","line":37,"updated":"2018-01-09 17:36:28.000000000","message":"Could you elaborate what you mean with \"work through using Designate\"? Will the code be based on Designate? 
Will that add a dependency on Designate?","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"b77473e29d408ff0cc97d226a81efd646f4f3f9d","unresolved":false,"context_lines":[{"line_number":34,"context_line":"Proposed adoption model/plan"},{"line_number":35,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":36,"context_line":""},{"line_number":37,"context_line":"Initially this will be worked through using Designate, and as the model"},{"line_number":38,"context_line":"solidifies we will start encouraging other projects to use it."},{"line_number":39,"context_line":""},{"line_number":40,"context_line":"Initial Design"}],"source_content_type":"text/x-rst","patch_set":3,"id":"9f91af0f_b0c22b58","line":37,"in_reply_to":"9f91af0f_903847a0","updated":"2018-01-09 17:42:01.000000000","message":"No, no dependency on Designate - just that as I know that code base quite well, it would be easier to develop the prototype against Designate. 
(know where and what to test, and loading the middleware is easier than trying on another project that I may not know as well.)\n\nIt will be an independent library, and Designate will interact with it like any other service.","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":19298,"name":"Nicolas Bock","email":"nicolas.bock@canonical.com","username":"nicolasbock"},"change_message_id":"e9fba497ba3ce863ffb31d76efb27a628f293d77","unresolved":false,"context_lines":[{"line_number":34,"context_line":"Proposed adoption model/plan"},{"line_number":35,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":36,"context_line":""},{"line_number":37,"context_line":"Initially this will be worked through using Designate, and as the model"},{"line_number":38,"context_line":"solidifies we will start encouraging other projects to use it."},{"line_number":39,"context_line":""},{"line_number":40,"context_line":"Initial Design"}],"source_content_type":"text/x-rst","patch_set":3,"id":"9f91af0f_50595f8b","line":37,"in_reply_to":"9f91af0f_b0c22b58","updated":"2018-01-09 17:53:40.000000000","message":"Thanks, that makes sense.","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":10239,"name":"Dmitry Tantsur","email":"dtantsur@protonmail.com","username":"dtantsur"},"change_message_id":"a65c64c5e7fa7a2c991076a97157679281a2e6e5","unresolved":false,"context_lines":[{"line_number":110,"context_line":"of the middleware, the status of the service being called, and its test"},{"line_number":111,"context_line":"results."},{"line_number":112,"context_line":""},{"line_number":113,"context_line":".. 
http:get:: /healthcheck/v1"},{"line_number":114,"context_line":""},{"line_number":115,"context_line":"  **Example response**:"},{"line_number":116,"context_line":""}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_a76881f4","line":113,"range":{"start_line":113,"start_character":0,"end_line":113,"end_character":29},"updated":"2018-01-18 16:18:01.000000000","message":"Just to clarify my understanding. This endpoint is added to the root of a service, right? So e.g. for Ironic it will be https://openstack.example.com/baremetal/v1/healthcheck/v1? Or https://openstack.example.com/baremetal/healthcheck/v1? It\u0027s worth clarifying.\n\nIf my understanding is right, how will this middleware play with microversions? Should its API be 1. versioned separately, 2. versioned with the main service API, 3. not versioned?","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"0342933e9807c6a2fe649c0a1a38f68effada4b8","unresolved":false,"context_lines":[{"line_number":110,"context_line":"of the middleware, the status of the service being called, and its test"},{"line_number":111,"context_line":"results."},{"line_number":112,"context_line":""},{"line_number":113,"context_line":".. http:get:: /healthcheck/v1"},{"line_number":114,"context_line":""},{"line_number":115,"context_line":"  **Example response**:"},{"line_number":116,"context_line":""}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_6adec03e","line":113,"range":{"start_line":113,"start_character":0,"end_line":113,"end_character":29},"in_reply_to":"7f96bb07_a76881f4","updated":"2018-01-18 16:34:11.000000000","message":"It would be at the *unversioned* root of the API.\n\neg. 
https://openstack.example.com/baremetal/healthcheck/v1\n\nThe API should be versioned separately, as the version is for the healthcheck version, which will be cross project (and hopefully consistent across projects in a deployment)","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":10239,"name":"Dmitry Tantsur","email":"dtantsur@protonmail.com","username":"dtantsur"},"change_message_id":"a65c64c5e7fa7a2c991076a97157679281a2e6e5","unresolved":false,"context_lines":[{"line_number":124,"context_line":"        \"test_failures\": 0,"},{"line_number":125,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":126,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":127,"context_line":"        \"service\": \"designate-api\","},{"line_number":128,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":129,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":130,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_a7364102","line":127,"range":{"start_line":127,"start_character":0,"end_line":127,"end_character":35},"updated":"2018-01-18 16:18:01.000000000","message":"How does it play with /services subresource? Shouldn\u0027t we report only the overall health in the root endpoint? Is it even the same \"service\" as in /services? 
Maybe we should call \"service\" something like \"baremetal\", while then going into /components (\"ironic-api\", \"ironic-conductor-\u003chostname\u003e\" or whatever)?","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"0342933e9807c6a2fe649c0a1a38f68effada4b8","unresolved":false,"context_lines":[{"line_number":124,"context_line":"        \"test_failures\": 0,"},{"line_number":125,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":126,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":127,"context_line":"        \"service\": \"designate-api\","},{"line_number":128,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":129,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":130,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_ead3b0ff","line":127,"range":{"start_line":127,"start_character":0,"end_line":127,"end_character":35},"in_reply_to":"7f96bb07_a7364102","updated":"2018-01-18 16:34:11.000000000","message":"The \"/components\" endpoint is a \"nice to have\" add on in my mind.\n\nIdeally *all* components will be running this middleware, even if they do not have an API, so in my mind the root should be the status of the current process, not the service as a whole.","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":10239,"name":"Dmitry Tantsur","email":"dtantsur@protonmail.com","username":"dtantsur"},"change_message_id":"a65c64c5e7fa7a2c991076a97157679281a2e6e5","unresolved":false,"context_lines":[{"line_number":129,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":130,"context_line":"        \"created_at\": 
\"2018-01-01T12:16:29.511Z\","},{"line_number":131,"context_line":"        \"heartbeated_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":132,"context_line":"        \"tests\": {"},{"line_number":133,"context_line":"          \"message_queue_connection\": {"},{"line_number":134,"context_line":"            \"status\": \"HEALTHY\","},{"line_number":135,"context_line":"            \"tested_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":136,"context_line":"            \"message\": \"Service is connecting to the queue normally\""},{"line_number":137,"context_line":"          },"},{"line_number":138,"context_line":"          \"keystone_connection\": {"},{"line_number":139,"context_line":"            \"status\": \"HEALTHY\","},{"line_number":140,"context_line":"            \"tested_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":141,"context_line":"            \"message\": \"Service is connecting to Keystone normally\""},{"line_number":142,"context_line":"          },"},{"line_number":143,"context_line":"          \"other_plugin_test\": {"},{"line_number":144,"context_line":"            \"status\": \"HEALTHY\","},{"line_number":145,"context_line":"            \"tested_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":146,"context_line":"            \"message\": \"Service is foobarring ++ today\""},{"line_number":147,"context_line":"          }"},{"line_number":148,"context_line":"        },"},{"line_number":149,"context_line":"        \"links\": {"},{"line_number":150,"context_line":"          \"self\": \"https://127.0.0.1/dns/healthcheck/v1\","},{"line_number":151,"context_line":"          \"services\": \"https://127.0.0.1/dns/healthcheck/v1/services\","}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_c746edc0","line":148,"range":{"start_line":132,"start_character":0,"end_line":148,"end_character":10},"updated":"2018-01-18 16:18:01.000000000","message":"(mostly thinking aloud) I wonder if we should create a /tests subresource to allow 
easier expansion in the future. I\u0027m not sure how far we want to develop this API.","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"0342933e9807c6a2fe649c0a1a38f68effada4b8","unresolved":false,"context_lines":[{"line_number":129,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":130,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":131,"context_line":"        \"heartbeated_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":132,"context_line":"        \"tests\": {"},{"line_number":133,"context_line":"          \"message_queue_connection\": {"},{"line_number":134,"context_line":"            \"status\": \"HEALTHY\","},{"line_number":135,"context_line":"            \"tested_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":136,"context_line":"            \"message\": \"Service is connecting to the queue normally\""},{"line_number":137,"context_line":"          },"},{"line_number":138,"context_line":"          \"keystone_connection\": {"},{"line_number":139,"context_line":"            \"status\": \"HEALTHY\","},{"line_number":140,"context_line":"            \"tested_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":141,"context_line":"            \"message\": \"Service is connecting to Keystone normally\""},{"line_number":142,"context_line":"          },"},{"line_number":143,"context_line":"          \"other_plugin_test\": {"},{"line_number":144,"context_line":"            \"status\": \"HEALTHY\","},{"line_number":145,"context_line":"            \"tested_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":146,"context_line":"            \"message\": \"Service is foobarring ++ today\""},{"line_number":147,"context_line":"          }"},{"line_number":148,"context_line":"        },"},{"line_number":149,"context_line":"        \"links\": 
{"},{"line_number":150,"context_line":"          \"self\": \"https://127.0.0.1/dns/healthcheck/v1\","},{"line_number":151,"context_line":"          \"services\": \"https://127.0.0.1/dns/healthcheck/v1/services\","}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_caaa746f","line":148,"range":{"start_line":132,"start_character":0,"end_line":148,"end_character":10},"in_reply_to":"7f96bb07_c746edc0","updated":"2018-01-18 16:34:11.000000000","message":"I think right now we can avoid /tests.\n\nI initially had a /tests endpoint, but decided to leave it out, to allow us to actually get an MVP out, and used.\n\nI think a v2 may exist in the future, but we need to see real world usage / problems first.","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":10239,"name":"Dmitry Tantsur","email":"dtantsur@protonmail.com","username":"dtantsur"},"change_message_id":"a65c64c5e7fa7a2c991076a97157679281a2e6e5","unresolved":false,"context_lines":[{"line_number":146,"context_line":"            \"message\": \"Service is foobarring ++ today\""},{"line_number":147,"context_line":"          }"},{"line_number":148,"context_line":"        },"},{"line_number":149,"context_line":"        \"links\": {"},{"line_number":150,"context_line":"          \"self\": \"https://127.0.0.1/dns/healthcheck/v1\","},{"line_number":151,"context_line":"          \"services\": \"https://127.0.0.1/dns/healthcheck/v1/services\","},{"line_number":152,"context_line":"        }"},{"line_number":153,"context_line":"      }"},{"line_number":154,"context_line":""},{"line_number":155,"context_line":"   :statuscode 200: Service is running correctly"}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_6701196a","line":152,"range":{"start_line":149,"start_character":0,"end_line":152,"end_character":9},"updated":"2018-01-18 16:18:01.000000000","message":"Could you please get closer to the API-SIG guideline? 
http://specs.openstack.org/openstack/api-wg/guidelines/links.html","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"0342933e9807c6a2fe649c0a1a38f68effada4b8","unresolved":false,"context_lines":[{"line_number":146,"context_line":"            \"message\": \"Service is foobarring ++ today\""},{"line_number":147,"context_line":"          }"},{"line_number":148,"context_line":"        },"},{"line_number":149,"context_line":"        \"links\": {"},{"line_number":150,"context_line":"          \"self\": \"https://127.0.0.1/dns/healthcheck/v1\","},{"line_number":151,"context_line":"          \"services\": \"https://127.0.0.1/dns/healthcheck/v1/services\","},{"line_number":152,"context_line":"        }"},{"line_number":153,"context_line":"      }"},{"line_number":154,"context_line":""},{"line_number":155,"context_line":"   :statuscode 200: Service is running correctly"}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_6ab7a0f6","line":152,"range":{"start_line":149,"start_character":0,"end_line":152,"end_character":9},"in_reply_to":"7f96bb07_6701196a","updated":"2018-01-18 16:34:11.000000000","message":"sure, that was my bad. will rev with this updated.","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":10239,"name":"Dmitry Tantsur","email":"dtantsur@protonmail.com","username":"dtantsur"},"change_message_id":"a65c64c5e7fa7a2c991076a97157679281a2e6e5","unresolved":false,"context_lines":[{"line_number":176,"context_line":"   This would allow for non HTTP exposed services to be healthchecked alongside"},{"line_number":177,"context_line":"   HTTP exposed ones."},{"line_number":178,"context_line":""},{"line_number":179,"context_line":".. 
http:get:: /healthcheck/v1/services"},{"line_number":180,"context_line":""},{"line_number":181,"context_line":"  **Example response**:"},{"line_number":182,"context_line":""}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_4a074419","line":179,"range":{"start_line":179,"start_character":30,"end_line":179,"end_character":38},"updated":"2018-01-18 16:18:01.000000000","message":"So, as I mentioned above \"service\" is a bit too overloaded. We have a service catalog talking about the Compute service, etc. Should we maybe use something else, e.g. \"components\"?","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":10239,"name":"Dmitry Tantsur","email":"dtantsur@protonmail.com","username":"dtantsur"},"change_message_id":"a65c64c5e7fa7a2c991076a97157679281a2e6e5","unresolved":false,"context_lines":[{"line_number":197,"context_line":"            \"hostname\": \"central-0001-us1az1.designate.example.com\","},{"line_number":198,"context_line":"            \"created_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":199,"context_line":"            \"heartbeated_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":200,"context_line":"            \"links\": {"},{"line_number":201,"context_line":"              \"self\": \"https://127.0.0.1/dns/healthcheck/v1/services/a86dba58-0043-4cc6-a1bb-69d5e86f3ca3\""},{"line_number":202,"context_line":"            }"},{"line_number":203,"context_line":"          },"},{"line_number":204,"context_line":"          {"},{"line_number":205,"context_line":"            \"status\": \"HEALTHY\","}],"source_content_type":"text/x-rst","patch_set":3,"id":"7f96bb07_ea15f03d","line":202,"range":{"start_line":200,"start_character":0,"end_line":202,"end_character":13},"updated":"2018-01-18 16:18:01.000000000","message":"ditto re links","commit_id":"adb1226f6401c797abbb030fe066395e9d13a6f5"},{"author":{"_account_id":21414,"name":"Yujun 
Zhang","email":"zhang.yujunz@zte.com.cn","username":"yujunz"},"change_message_id":"4eb62fe1f63fc2e2e1d03bd127723dde46e326fc","unresolved":false,"context_lines":[{"line_number":26,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":27,"context_line":""},{"line_number":28,"context_line":"The `Healthcheck`_ middleware in the current oslo.middleware repo exists, and it"},{"line_number":29,"context_line":"is currently used by Glance, Glare, Keystone, Manila and Swift."},{"line_number":30,"context_line":""},{"line_number":31,"context_line":"However this middleware is limited in scope, and expanding its capabilities"},{"line_number":32,"context_line":"would cause compatibility issues for current users."}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_1907ca1e","line":29,"range":{"start_line":29,"start_character":57,"end_line":29,"end_character":62},"updated":"2018-01-24 07:43:19.000000000","message":"Also by Vitrage","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"58c531dc5a48cd62f9349748a6009c23b2832a8a","unresolved":false,"context_lines":[{"line_number":46,"context_line":""},{"line_number":47,"context_line":"* A simple piece of middleware (based on oslo.middleware) that can read test"},{"line_number":48,"context_line":"  results"},{"line_number":49,"context_line":"* A periodic test runner (that loads the custom tests)"},{"line_number":50,"context_line":"* A base test plugin, that can be used to create custom tests per service."},{"line_number":51,"context_line":""},{"line_number":52,"context_line":"Splitting the tests and the reporting allows for more in-depth tests that may"}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_442144fb","line":49,"updated":"2018-01-25 12:58:03.000000000","message":"This would cause 
issues for a lot of healthchecking systems (e.g. kubernetes liveness probes) that only support a single call to decide if a system is OK.\n\nThe point of this framework is to be consistent, and I do not believe that we should be allowing differences like this from the get go - I think being opinionated from the start is better.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":46,"context_line":""},{"line_number":47,"context_line":"* A simple piece of middleware (based on oslo.middleware) that can read test"},{"line_number":48,"context_line":"  results"},{"line_number":49,"context_line":"* A periodic test runner (that loads the custom tests)"},{"line_number":50,"context_line":"* A base test plugin, that can be used to create custom tests per service."},{"line_number":51,"context_line":""},{"line_number":52,"context_line":"Splitting the tests and the reporting allows for more in-depth tests that may"}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_df62dfd8","line":49,"range":{"start_line":49,"start_character":4,"end_line":49,"end_character":12},"updated":"2018-01-22 20:01:47.000000000","message":"definitely a good idea to split tests and reporting, but making the tests periodic may not be the best answer, at least not for everyone. I might not want the tests to run at all until I ask for them to run. And I might want to ask for them to run *now* rather than have to wait for the period to come around again.\n\nI.e., have separate API calls for \"run the tests\" vs. \"get me the latest results\". 
If I call the latter and the results are older than I\u0027d like, I can call the former and then ask for the results again.\n\nIf someone does want periodic, they could always setup a cron job to call the \"run the tests\" API periodically.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"58c531dc5a48cd62f9349748a6009c23b2832a8a","unresolved":false,"context_lines":[{"line_number":50,"context_line":"* A base test plugin, that can be used to create custom tests per service."},{"line_number":51,"context_line":""},{"line_number":52,"context_line":"Splitting the tests and the reporting allows for more in-depth tests that may"},{"line_number":53,"context_line":"not want to run on very call (to try and avoid DoS attacks)."},{"line_number":54,"context_line":""},{"line_number":55,"context_line":"A few builtin tests should be provided as well - oslo.messaging / db / keystone"},{"line_number":56,"context_line":"would be good examples."}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_641ec0bb","line":53,"updated":"2018-01-25 12:58:03.000000000","message":"Ack","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":50,"context_line":"* A base test plugin, that can be used to create custom tests per service."},{"line_number":51,"context_line":""},{"line_number":52,"context_line":"Splitting the tests and the reporting allows for more in-depth tests that may"},{"line_number":53,"context_line":"not want to run on very call (to try and avoid DoS attacks)."},{"line_number":54,"context_line":""},{"line_number":55,"context_line":"A few builtin tests should be provided as well - oslo.messaging / db / 
keystone"},{"line_number":56,"context_line":"would be good examples."}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_7c487986","line":53,"range":{"start_line":53,"start_character":19,"end_line":53,"end_character":23},"updated":"2018-01-22 20:01:47.000000000","message":"every*","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"58c531dc5a48cd62f9349748a6009c23b2832a8a","unresolved":false,"context_lines":[{"line_number":52,"context_line":"Splitting the tests and the reporting allows for more in-depth tests that may"},{"line_number":53,"context_line":"not want to run on very call (to try and avoid DoS attacks)."},{"line_number":54,"context_line":""},{"line_number":55,"context_line":"A few builtin tests should be provided as well - oslo.messaging / db / keystone"},{"line_number":56,"context_line":"would be good examples."},{"line_number":57,"context_line":""},{"line_number":58,"context_line":"These plugins could be loaded dynamically by reading entrypoints which loads"}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_0479fcfa","line":55,"updated":"2018-01-25 12:58:03.000000000","message":"Ack","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":52,"context_line":"Splitting the tests and the reporting allows for more in-depth tests that may"},{"line_number":53,"context_line":"not want to run on very call (to try and avoid DoS attacks)."},{"line_number":54,"context_line":""},{"line_number":55,"context_line":"A few builtin tests should be provided as well - oslo.messaging / db / keystone"},{"line_number":56,"context_line":"would be good 
examples."},{"line_number":57,"context_line":""},{"line_number":58,"context_line":"These plugins could be loaded dynamically by reading entrypoints which loads"}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_5ff24f08","line":55,"range":{"start_line":55,"start_character":49,"end_line":55,"end_character":79},"updated":"2018-01-22 20:01:47.000000000","message":"please detail what you are suggesting would be tested for these","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":62,"context_line":"This allows for flexibility for services, without causing tooling to be updated"},{"line_number":63,"context_line":"for each new service."},{"line_number":64,"context_line":""},{"line_number":65,"context_line":"These results will be only be stored locally, and not as a timeseries set of"},{"line_number":66,"context_line":"data - there will always only be one set of results per test, along with a"},{"line_number":67,"context_line":"timestamp of when the result was updated. 
`designate service-status`_ is an"},{"line_number":68,"context_line":"example of this."}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_bfd4b348","line":65,"range":{"start_line":65,"start_character":19,"end_line":65,"end_character":29},"updated":"2018-01-22 20:01:47.000000000","message":"only be*","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"58c531dc5a48cd62f9349748a6009c23b2832a8a","unresolved":false,"context_lines":[{"line_number":78,"context_line":"I see (initially) 3 statuses:"},{"line_number":79,"context_line":""},{"line_number":80,"context_line":"* HEALTHY"},{"line_number":81,"context_line":"* UNHEALTHY"},{"line_number":82,"context_line":"* INITIALISING"},{"line_number":83,"context_line":""},{"line_number":84,"context_line":"Test Plugin"}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_247e78e4","line":81,"updated":"2018-01-25 12:58:03.000000000","message":"A lot of the use for this will be automated systems like kubernetes or vitrage - we want to provide a simple RED / GREEN (UNHEALTHY / HEALTHY) status.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":78,"context_line":"I see (initially) 3 statuses:"},{"line_number":79,"context_line":""},{"line_number":80,"context_line":"* HEALTHY"},{"line_number":81,"context_line":"* UNHEALTHY"},{"line_number":82,"context_line":"* INITIALISING"},{"line_number":83,"context_line":""},{"line_number":84,"context_line":"Test Plugin"}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_9f13f7b8","line":81,"range":{"start_line":81,"start_character":2,"end_line":81,"end_character":11},"updated":"2018-01-22 
20:01:47.000000000","message":"I would suggest WARNING in place of (or at least in addition to) UNHEALTHY. It\u0027s a little less judgemental, more \"admin, take a look and determine yourself whether this is ok\".","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":21414,"name":"Yujun Zhang","email":"zhang.yujunz@zte.com.cn","username":"yujunz"},"change_message_id":"4eb62fe1f63fc2e2e1d03bd127723dde46e326fc","unresolved":false,"context_lines":[{"line_number":94,"context_line":"       \"message\": \"Service is foobarring ++ today\""},{"line_number":95,"context_line":"    }"},{"line_number":96,"context_line":""},{"line_number":97,"context_line":"It should also have a ``view_name`` variable that will used to distinguish the test"},{"line_number":98,"context_line":"from other tests, and a ``timeout`` variable that indicates the oldest \"HEALTHY\""},{"line_number":99,"context_line":"result, after which time, the test result will be considered \"UNHEALTHY\"."},{"line_number":100,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_74a933db","line":97,"range":{"start_line":97,"start_character":24,"end_line":97,"end_character":33},"updated":"2018-01-24 07:43:19.000000000","message":"Not seen in the example response. 
What exactly is it?","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":94,"context_line":"       \"message\": \"Service is foobarring ++ today\""},{"line_number":95,"context_line":"    }"},{"line_number":96,"context_line":""},{"line_number":97,"context_line":"It should also have a ``view_name`` variable that will used to distinguish the test"},{"line_number":98,"context_line":"from other tests, and a ``timeout`` variable that indicates the oldest \"HEALTHY\""},{"line_number":99,"context_line":"result, after which time, the test result will be considered \"UNHEALTHY\"."},{"line_number":100,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_3f450316","line":97,"range":{"start_line":97,"start_character":50,"end_line":97,"end_character":55},"updated":"2018-01-22 20:01:47.000000000","message":"will be*","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"58c531dc5a48cd62f9349748a6009c23b2832a8a","unresolved":false,"context_lines":[{"line_number":95,"context_line":"    }"},{"line_number":96,"context_line":""},{"line_number":97,"context_line":"It should also have a ``view_name`` variable that will used to distinguish the test"},{"line_number":98,"context_line":"from other tests, and a ``timeout`` variable that indicates the oldest \"HEALTHY\""},{"line_number":99,"context_line":"result, after which time, the test result will be considered \"UNHEALTHY\"."},{"line_number":100,"context_line":""},{"line_number":101,"context_line":"When a request to the middleware arrives, it looks up the list of test 
results,"}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_c48234c9","line":98,"updated":"2018-01-25 12:58:03.000000000","message":"As said here, the timeout will be used if a test has not reported a result in a long time. This time\nis defined on a per test basis by the timeout variable.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":21414,"name":"Yujun Zhang","email":"zhang.yujunz@zte.com.cn","username":"yujunz"},"change_message_id":"4eb62fe1f63fc2e2e1d03bd127723dde46e326fc","unresolved":false,"context_lines":[{"line_number":95,"context_line":"    }"},{"line_number":96,"context_line":""},{"line_number":97,"context_line":"It should also have a ``view_name`` variable that will used to distinguish the test"},{"line_number":98,"context_line":"from other tests, and a ``timeout`` variable that indicates the oldest \"HEALTHY\""},{"line_number":99,"context_line":"result, after which time, the test result will be considered \"UNHEALTHY\"."},{"line_number":100,"context_line":""},{"line_number":101,"context_line":"When a request to the middleware arrives, it looks up the list of test results,"}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_34b3ab4f","line":98,"range":{"start_line":98,"start_character":26,"end_line":98,"end_character":33},"updated":"2018-01-24 07:43:19.000000000","message":"Not seen in the example response either. 
Could you please elaborate what is the expected behavior?","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":95,"context_line":"    }"},{"line_number":96,"context_line":""},{"line_number":97,"context_line":"It should also have a ``view_name`` variable that will used to distinguish the test"},{"line_number":98,"context_line":"from other tests, and a ``timeout`` variable that indicates the oldest \"HEALTHY\""},{"line_number":99,"context_line":"result, after which time, the test result will be considered \"UNHEALTHY\"."},{"line_number":100,"context_line":""},{"line_number":101,"context_line":"When a request to the middleware arrives, it looks up the list of test results,"},{"line_number":102,"context_line":"and compiles them into the responses below."}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_7f114b04","line":99,"range":{"start_line":98,"start_character":22,"end_line":99,"end_character":73},"updated":"2018-01-22 20:01:47.000000000","message":"I\u0027m not following your meaning here... can you clarify?","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":110,"context_line":"of the middleware, the status of the service being called, and its test"},{"line_number":111,"context_line":"results."},{"line_number":112,"context_line":""},{"line_number":113,"context_line":"This will be at the **unversioned** root of a service e.g. 
"},{"line_number":114,"context_line":"``https://openstack.example.com/dns/healthcheck/v1``."},{"line_number":115,"context_line":""},{"line_number":116,"context_line":"It will be versioned completely separately from the services API, as the "}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_bfa353b1","line":113,"range":{"start_line":113,"start_character":58,"end_line":113,"end_character":59},"updated":"2018-01-22 20:01:47.000000000","message":"pep8: whitespace on end of line here and lines 116 \u0026 118.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"58c531dc5a48cd62f9349748a6009c23b2832a8a","unresolved":false,"context_lines":[{"line_number":113,"context_line":"This will be at the **unversioned** root of a service e.g. "},{"line_number":114,"context_line":"``https://openstack.example.com/dns/healthcheck/v1``."},{"line_number":115,"context_line":""},{"line_number":116,"context_line":"It will be versioned completely separately from the services API, as the "},{"line_number":117,"context_line":"version **should** be consistent across an OpenStack deployment, and will"},{"line_number":118,"context_line":"refer to the structure of the healthcheck response. "},{"line_number":119,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_e48730d7","line":116,"updated":"2018-01-25 12:58:03.000000000","message":"It won\u0027t. Microversioning logic is *far* to complex to try and get into a lot of healthcheck tooling.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":113,"context_line":"This will be at the **unversioned** root of a service e.g. 
"},{"line_number":114,"context_line":"``https://openstack.example.com/dns/healthcheck/v1``."},{"line_number":115,"context_line":""},{"line_number":116,"context_line":"It will be versioned completely separately from the services API, as the "},{"line_number":117,"context_line":"version **should** be consistent across an OpenStack deployment, and will"},{"line_number":118,"context_line":"refer to the structure of the healthcheck response. "},{"line_number":119,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_4ad7eb50","line":116,"range":{"start_line":116,"start_character":0,"end_line":116,"end_character":20},"updated":"2018-01-22 20:01:47.000000000","message":"please specify that this will use microversioning.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"58c531dc5a48cd62f9349748a6009c23b2832a8a","unresolved":false,"context_lines":[{"line_number":132,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":133,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":134,"context_line":"        \"service\": \"designate-api\","},{"line_number":135,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":136,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":137,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":138,"context_line":"        \"heartbeated_at\": \"2018-01-01T12:16:29.511Z\","}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_846c2c37","line":135,"updated":"2018-01-25 12:58:03.000000000","message":"This is not the \"all of designate\" level, this is the service being called. 
\n\nI am going to remove the /services/ endpoint from this spec, I think it added too much complexity and scope creep.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":131,"context_line":"        \"test_failures\": 0,"},{"line_number":132,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":133,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":134,"context_line":"        \"service\": \"designate-api\","},{"line_number":135,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":136,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":137,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":138,"context_line":"        \"heartbeated_at\": \"2018-01-01T12:16:29.511Z\","}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_0aeb23dc","line":135,"range":{"start_line":134,"start_character":0,"end_line":135,"end_character":61},"updated":"2018-01-22 20:01:47.000000000","message":"this isn\u0027t relevant at this all-of-designate level, nor is the tests block. 
Those should only be returned if you go to the healthcheck/v1/services/ or healthcheck/v1/services/{service_id} urls.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":132,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":133,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":134,"context_line":"        \"service\": \"designate-api\","},{"line_number":135,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":136,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":137,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":138,"context_line":"        \"heartbeated_at\": \"2018-01-01T12:16:29.511Z\","}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_32d6acc7","line":135,"in_reply_to":"5f93b717_846c2c37","updated":"2018-01-26 15:40:31.000000000","message":"on the contrary, I think that was important in bringing this issue to light... See continued comments in PS5","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"58c531dc5a48cd62f9349748a6009c23b2832a8a","unresolved":false,"context_lines":[{"line_number":166,"context_line":"      }"},{"line_number":167,"context_line":""},{"line_number":168,"context_line":"   :statuscode 200: Service is running correctly"},{"line_number":169,"context_line":"   :statuscode 503: Service is having issues"},{"line_number":170,"context_line":""},{"line_number":171,"context_line":".. 
note::"},{"line_number":172,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_a4692826","line":169,"updated":"2018-01-25 12:58:03.000000000","message":"You are not misreading.\n\nA lot of healthcheck checking systems require a change of status code for something to be considered broken.\n\nI am not happy with this myself, but I am not sure how we work around pre-existing tooling that requires this.\n\nOn the keystone / policy question, right now, it is not behind either middleware (how can you check if the service\u0027s connection to keystone is working, if you need keystone to get results).\n\nWe need to clarify in docs, that this endpoint is restricted to safe IPs (aka the container host where kubelet is running / icinga servers / load balancers etc) or in the paste.ini deployers put keystone middleware in front.\n\nAgain, most healthchecking systems will not support getting a token from one place, then using it in the request for the healthcheck.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":165,"context_line":"        ]"},{"line_number":166,"context_line":"      }"},{"line_number":167,"context_line":""},{"line_number":168,"context_line":"   :statuscode 200: Service is running correctly"},{"line_number":169,"context_line":"   :statuscode 503: Service is having issues"},{"line_number":170,"context_line":""},{"line_number":171,"context_line":".. 
note::"},{"line_number":172,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_5fb7af87","line":169,"range":{"start_line":168,"start_character":0,"end_line":169,"end_character":44},"updated":"2018-01-22 20:01:47.000000000","message":"it sounds like you\u0027re saying 200 \u003d HEALTHY and 503 \u003d UNHEALTHY, but I hope that\u0027s just me misreading. Should always return 200 unless there\u0027s a bug in the middleware leading to a 5xx or an auth problem leading to 401 or 403.\n\nSpeaking of which, we should make it explicit somewhere in this doc that these URLs will require a valid keystone token and be checked with oslo_policy, even though they\u0027re on the unversioned base URL.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":170,"context_line":""},{"line_number":171,"context_line":".. note::"},{"line_number":172,"context_line":""},{"line_number":173,"context_line":"   Currently, this middleware will only work for services that expose an API,"},{"line_number":174,"context_line":"   which only covers a small section of OpenStack services - many of them do"},{"line_number":175,"context_line":"   not - e.g.:"},{"line_number":176,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_8a43b313","line":173,"updated":"2018-01-22 20:01:47.000000000","message":"Currently it doesn\u0027t exist, so this needs rewording. And we need to account for this from day one. 
Even if we decide to implement tests for non-API services in a later stage, the design should allow for surfacing results from non-API services via an API service, so that we have minimal (if any) changes to the healthcheck APIs when that is added.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"58c531dc5a48cd62f9349748a6009c23b2832a8a","unresolved":false,"context_lines":[{"line_number":170,"context_line":""},{"line_number":171,"context_line":".. note::"},{"line_number":172,"context_line":""},{"line_number":173,"context_line":"   Currently, this middleware will only work for services that expose an API,"},{"line_number":174,"context_line":"   which only covers a small section of OpenStack services - many of them do"},{"line_number":175,"context_line":"   not - e.g.:"},{"line_number":176,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_44762409","line":173,"updated":"2018-01-25 12:58:03.000000000","message":"This is what is laid out below.\n\nIdeally, we would add http interfaces on *all* services, so we wouldn\u0027t need the system laid out below at all.\n\nI am actually going to remove the stuff below, I think it is confusing the issue.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":170,"context_line":""},{"line_number":171,"context_line":".. 
note::"},{"line_number":172,"context_line":""},{"line_number":173,"context_line":"   Currently, this middleware will only work for services that expose an API,"},{"line_number":174,"context_line":"   which only covers a small section of OpenStack services - many of them do"},{"line_number":175,"context_line":"   not - e.g.:"},{"line_number":176,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"5f93b717_7b3e3fdb","line":173,"in_reply_to":"5f93b717_44762409","updated":"2018-01-26 15:40:31.000000000","message":"I think you\u0027ll find a lot of people disagree with that being \"ideal\"...","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":21414,"name":"Yujun Zhang","email":"zhang.yujunz@zte.com.cn","username":"yujunz"},"change_message_id":"4eb62fe1f63fc2e2e1d03bd127723dde46e326fc","unresolved":false,"context_lines":[{"line_number":209,"context_line":"            \"service_id\": \"a86dba58-0043-4cc6-a1bb-69d5e86f3ca3\","},{"line_number":210,"context_line":"            \"hostname\": \"central-0001-us1az1.designate.example.com\","},{"line_number":211,"context_line":"            \"created_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":212,"context_line":"            \"heartbeated_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":213,"context_line":"            \"links\": ["},{"line_number":214,"context_line":"                {"},{"line_number":215,"context_line":"                  \"rel\": \"self\","}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_d49a87c3","line":212,"range":{"start_line":212,"start_character":13,"end_line":212,"end_character":27},"updated":"2018-01-24 07:43:19.000000000","message":"heartbeaten_at","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew 
Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"71a4696c4c077b25e415a9e070335a45be5642dd","unresolved":false,"context_lines":[{"line_number":210,"context_line":"            \"hostname\": \"central-0001-us1az1.designate.example.com\","},{"line_number":211,"context_line":"            \"created_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":212,"context_line":"            \"heartbeated_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":213,"context_line":"            \"links\": ["},{"line_number":214,"context_line":"                {"},{"line_number":215,"context_line":"                  \"rel\": \"self\","},{"line_number":216,"context_line":"                  \"href\": \"https://127.0.0.1/dns/healthcheck/v1/services/a86dba58-0043-4cc6-a1bb-69d5e86f3ca3\""}],"source_content_type":"text/x-rst","patch_set":4,"id":"7f96bb07_0aeec3b3","line":213,"updated":"2018-01-22 20:01:47.000000000","message":"should have a tests section for each of these services as well as for the API service.","commit_id":"0320f58fc425296ce12889609d4bff0d37a2fea7"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":89,"context_line":"   This will probably be a bikeshed - but can be changed as we move on"},{"line_number":90,"context_line":"   with development"},{"line_number":91,"context_line":""},{"line_number":92,"context_line":"I see (initially) 3 statuses:"},{"line_number":93,"context_line":""},{"line_number":94,"context_line":"* HEALTHY"},{"line_number":95,"context_line":"* UNHEALTHY"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_5b6d8302","line":92,"range":{"start_line":92,"start_character":20,"end_line":92,"end_character":28},"updated":"2018-01-26 15:40:31.000000000","message":"I\u0027d like to see at least one more state. 
If the test _knows_ things are HEALTHY it can use that, or if it _knows_ things are UNHEALTHY it can use that, but it should also be possible to write tests to return an indeterminate status (WARNING?) to flag that there _may_ be a problem and some investigation would be warranted.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"61510052649dedfcd60d7e88317ec880da095a8a","unresolved":false,"context_lines":[{"line_number":89,"context_line":"   This will probably be a bikeshed - but can be changed as we move on"},{"line_number":90,"context_line":"   with development"},{"line_number":91,"context_line":""},{"line_number":92,"context_line":"I see (initially) 3 statuses:"},{"line_number":93,"context_line":""},{"line_number":94,"context_line":"* HEALTHY"},{"line_number":95,"context_line":"* UNHEALTHY"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_edb583ce","line":92,"updated":"2018-01-29 14:41:18.000000000","message":"Personally, I would be of the opinion that if a service may be unhealthy, it should be pulled out of rotation / load balancer pool / use / etc  anyway","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":89,"context_line":"   This will probably be a bikeshed - but can be changed as we move on"},{"line_number":90,"context_line":"   with development"},{"line_number":91,"context_line":""},{"line_number":92,"context_line":"I see (initially) 3 statuses:"},{"line_number":93,"context_line":""},{"line_number":94,"context_line":"* HEALTHY"},{"line_number":95,"context_line":"* 
UNHEALTHY"}],"source_content_type":"text/x-rst","patch_set":5,"id":"df7087c5_60aaffd2","line":92,"in_reply_to":"5f93b717_edb583ce","updated":"2018-03-09 16:05:19.000000000","message":"Sometimes, but certainly not always. Some services can\u0027t be pulled out of use (e.g. nova). Sometimes admins will want to investigate before taking such a drastic action. Etc. Surface the information so the caller can make that decision.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":120,"context_line":"    \"message\": \"Service is connecting to the queue normally\""},{"line_number":121,"context_line":"  }"},{"line_number":122,"context_line":""},{"line_number":123,"context_line":"The test should also define a ``timeout`` variable that indicates the oldest"},{"line_number":124,"context_line":"\"HEALTHY\" result, after which time, the test result will be considered"},{"line_number":125,"context_line":"\"UNHEALTHY\"."},{"line_number":126,"context_line":""},{"line_number":127,"context_line":""},{"line_number":128,"context_line":"This is used when looking up the results when an API call arrives, so the #"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_1b6dbb31","line":125,"range":{"start_line":123,"start_character":56,"end_line":125,"end_character":12},"updated":"2018-01-26 15:40:31.000000000","message":"A timestamp would indicate the oldest healthy result, not a timeout. 
I think what you mean is that this variable \"dictates how old the most recent test result can be before the API will report the test as unhealthy\" or something like that.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"61510052649dedfcd60d7e88317ec880da095a8a","unresolved":false,"context_lines":[{"line_number":122,"context_line":""},{"line_number":123,"context_line":"The test should also define a ``timeout`` variable that indicates the oldest"},{"line_number":124,"context_line":"\"HEALTHY\" result, after which time, the test result will be considered"},{"line_number":125,"context_line":"\"UNHEALTHY\"."},{"line_number":126,"context_line":""},{"line_number":127,"context_line":""},{"line_number":128,"context_line":"This is used when looking up the results when an API call arrives, so the #"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_8dccff6a","line":125,"updated":"2018-01-29 14:41:18.000000000","message":"Yeah, I thought the section below cleared this up, but I can re word.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"61510052649dedfcd60d7e88317ec880da095a8a","unresolved":false,"context_lines":[{"line_number":146,"context_line":"results."},{"line_number":147,"context_line":""},{"line_number":148,"context_line":"This will be at the **unversioned** root of a service e.g."},{"line_number":149,"context_line":"``https://127.0.0.1:9001/healthcheck/v1``. 
It exposes the health of the process"},{"line_number":150,"context_line":"being called, so it should not be used for overall service healthchecking."},{"line_number":151,"context_line":""},{"line_number":152,"context_line":"It will be versioned completely separately from the services API, as the"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_adc77b45","line":149,"updated":"2018-01-29 14:41:18.000000000","message":"Sure, I can add JSON Home support as a part of the spec","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":146,"context_line":"results."},{"line_number":147,"context_line":""},{"line_number":148,"context_line":"This will be at the **unversioned** root of a service e.g."},{"line_number":149,"context_line":"``https://127.0.0.1:9001/healthcheck/v1``. It exposes the health of the process"},{"line_number":150,"context_line":"being called, so it should not be used for overall service healthchecking."},{"line_number":151,"context_line":""},{"line_number":152,"context_line":"It will be versioned completely separately from the services API, as the"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_feb31d4c","line":149,"range":{"start_line":149,"start_character":25,"end_line":149,"end_character":36},"updated":"2018-01-26 15:40:31.000000000","message":"irrespective of whether this uses microversioning, we need to be able to do version discovery on /healthcheck. 
Please add that to the spec.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"61510052649dedfcd60d7e88317ec880da095a8a","unresolved":false,"context_lines":[{"line_number":149,"context_line":"``https://127.0.0.1:9001/healthcheck/v1``. It exposes the health of the process"},{"line_number":150,"context_line":"being called, so it should not be used for overall service healthchecking."},{"line_number":151,"context_line":""},{"line_number":152,"context_line":"It will be versioned completely separately from the services API, as the"},{"line_number":153,"context_line":"version **should** be consistent across an OpenStack deployment, and will"},{"line_number":154,"context_line":"refer to the structure of the healthcheck response."},{"line_number":155,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_4dc6f74a","line":152,"updated":"2018-01-29 14:41:18.000000000","message":"Ack","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":149,"context_line":"``https://127.0.0.1:9001/healthcheck/v1``. 
It exposes the health of the process"},{"line_number":150,"context_line":"being called, so it should not be used for overall service healthchecking."},{"line_number":151,"context_line":""},{"line_number":152,"context_line":"It will be versioned completely separately from the services API, as the"},{"line_number":153,"context_line":"version **should** be consistent across an OpenStack deployment, and will"},{"line_number":154,"context_line":"refer to the structure of the healthcheck response."},{"line_number":155,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_bbf36773","line":152,"range":{"start_line":152,"start_character":11,"end_line":152,"end_character":20},"updated":"2018-01-26 15:40:31.000000000","message":"you responded in the last PS that this will *not* be microversioned. Please call that out in the spec. That is an important point that everyone reviewing this needs to be clear on.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":150,"context_line":"being called, so it should not be used for overall service healthchecking."},{"line_number":151,"context_line":""},{"line_number":152,"context_line":"It will be versioned completely separately from the services API, as the"},{"line_number":153,"context_line":"version **should** be consistent across an OpenStack deployment, and will"},{"line_number":154,"context_line":"refer to the structure of the healthcheck response."},{"line_number":155,"context_line":""},{"line_number":156,"context_line":".. 
http:get:: /healthcheck/v1"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_fe5ebd10","line":153,"range":{"start_line":153,"start_character":22,"end_line":153,"end_character":63},"updated":"2018-01-26 15:40:31.000000000","message":"I think you mean that a given version should work consistently across *all* OpenStack deployments, but it will of course be possible that one service is currently using a different version of the middleware than another service within any single deployment, particularly during rolling upgrades. Please clarify the spec on this.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"61510052649dedfcd60d7e88317ec880da095a8a","unresolved":false,"context_lines":[{"line_number":150,"context_line":"being called, so it should not be used for overall service healthchecking."},{"line_number":151,"context_line":""},{"line_number":152,"context_line":"It will be versioned completely separately from the services API, as the"},{"line_number":153,"context_line":"version **should** be consistent across an OpenStack deployment, and will"},{"line_number":154,"context_line":"refer to the structure of the healthcheck response."},{"line_number":155,"context_line":""},{"line_number":156,"context_line":".. 
http:get:: /healthcheck/v1"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_6dc1732e","line":153,"updated":"2018-01-29 14:41:18.000000000","message":"Yup - will clarify","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":167,"context_line":"        \"test_failures\": 0,"},{"line_number":168,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":169,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":170,"context_line":"        \"service\": \"designate-api\","},{"line_number":171,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":172,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":173,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_db07531c","line":170,"updated":"2018-01-26 15:40:31.000000000","message":"The API for a service (e.g. nova) represents the whole service. Removing /services from this spec doesn\u0027t change the fact that the current design is not readily expandable to include test results for other processes within the service (e.g. nova-conductor). See comment in PS4. If we go forward with the design as-is, we will have to totally redesign it when we want to expand to cover those pieces in the future, so that the root URL is no longer specific to only the API portion (e.g. nova-api). 
Let\u0027s please not set ourselves up for an incompatible version bump in the near future.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"61510052649dedfcd60d7e88317ec880da095a8a","unresolved":false,"context_lines":[{"line_number":167,"context_line":"        \"test_failures\": 0,"},{"line_number":168,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":169,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":170,"context_line":"        \"service\": \"designate-api\","},{"line_number":171,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":172,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":173,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_0ddeaf10","line":170,"updated":"2018-01-29 14:41:18.000000000","message":"The root will never be the whole service.\n\nAt some point we may add back the /services and with that a /services/overview (or other wording)\n\nThe old /services was designed (in my head anyway) as a temporary solution to services that do not have an HTTP service, with the long term goal of adding this to them.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"239b75f4ad86b5fc1f2bb5dd5c08110c05e9323a","unresolved":false,"context_lines":[{"line_number":167,"context_line":"        \"test_failures\": 0,"},{"line_number":168,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":169,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":170,"context_line":"        \"service\": 
\"designate-api\","},{"line_number":171,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":172,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":173,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_1a033aba","line":170,"in_reply_to":"5f93b717_0ddeaf10","updated":"2018-01-29 22:58:49.000000000","message":"I can appreciate the thought, but I just don\u0027t see us ever getting to where every subservice has its own API endpoint. Unless you have had conversations with a whole bunch of key stakeholders and they\u0027re all shaking their head yes on that... don\u0027t bet on it.\n\nI\u0027m not entirely sure what you meant by \"The root will never be the whole service\". I\u0027m not suggesting that /healthcheck/v1 has to give an overview status. It could just not give status at all. But if it gives status, then it needs to be an overview, because that path covers the whole of nova/cinder/etc... not just nova-api/cinder-api/etc.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"61510052649dedfcd60d7e88317ec880da095a8a","unresolved":false,"context_lines":[{"line_number":202,"context_line":"      }"},{"line_number":203,"context_line":""},{"line_number":204,"context_line":"   :statuscode 200: Service is running correctly"},{"line_number":205,"context_line":"   :statuscode 503: Service is having issues"},{"line_number":206,"context_line":""},{"line_number":207,"context_line":".. 
note::"},{"line_number":208,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_2de12bd1","line":205,"updated":"2018-01-29 14:41:18.000000000","message":"It is an *extremely* common pattern across reverse proxies like nginx / haproxy, co-ordination systems like kubernetes, testing services like pingdom, and load balancers like f5.\n\nI really do not want to write a tool that requires a shim in place for a lot of the main users of the tool.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":201,"context_line":"        ]"},{"line_number":202,"context_line":"      }"},{"line_number":203,"context_line":""},{"line_number":204,"context_line":"   :statuscode 200: Service is running correctly"},{"line_number":205,"context_line":"   :statuscode 503: Service is having issues"},{"line_number":206,"context_line":""},{"line_number":207,"context_line":".. note::"},{"line_number":208,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_1ed0491f","line":205,"range":{"start_line":204,"start_character":0,"end_line":205,"end_character":44},"updated":"2018-01-26 15:40:31.000000000","message":"this is in violation of the API Guidelines. See https://specs.openstack.org/openstack/api-wg/guidelines/http.html#http-response-codes\n\nIf there are systems that require a change of status code to consider something broken, they could use a shim that sits between them and the OpenStack API to change status codes to their liking. 
But such systems will not be the only callers of this API and we shouldn\u0027t compromise our standards and make everyone suffer for their poor design.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"239b75f4ad86b5fc1f2bb5dd5c08110c05e9323a","unresolved":false,"context_lines":[{"line_number":202,"context_line":"      }"},{"line_number":203,"context_line":""},{"line_number":204,"context_line":"   :statuscode 200: Service is running correctly"},{"line_number":205,"context_line":"   :statuscode 503: Service is having issues"},{"line_number":206,"context_line":""},{"line_number":207,"context_line":".. note::"},{"line_number":208,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_dabcf2dd","line":205,"in_reply_to":"5f93b717_2de12bd1","updated":"2018-01-29 22:58:49.000000000","message":"I hear you. There\u0027s no great answer here. If you can get the API SIG to go along with this then fine. Good luck with that...","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":1063,"name":"Ed Leafe","email":"ed@leafe.com","username":"ed-leafe"},"change_message_id":"d6b59516c069321758bcfe31106b940c9672c125","unresolved":false,"context_lines":[{"line_number":202,"context_line":"      }"},{"line_number":203,"context_line":""},{"line_number":204,"context_line":"   :statuscode 200: Service is running correctly"},{"line_number":205,"context_line":"   :statuscode 503: Service is having issues"},{"line_number":206,"context_line":""},{"line_number":207,"context_line":".. note::"},{"line_number":208,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"3fa0c359_713f9e86","line":205,"in_reply_to":"5f93b717_a5f8d97a","updated":"2018-02-13 15:33:48.000000000","message":"Yeah, 503 really doesn\u0027t seem right for this. It\u0027s an expected response, not an error. 
200 with UNHEALTHY in the body would be the way to go. If the desire is to have a quick way of determining ack/nack for health, the body could contain such a field, so that the entire response need not be examined.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"24ae155055543957bddc6bc23d53ead805921a73","unresolved":false,"context_lines":[{"line_number":202,"context_line":"      }"},{"line_number":203,"context_line":""},{"line_number":204,"context_line":"   :statuscode 200: Service is running correctly"},{"line_number":205,"context_line":"   :statuscode 503: Service is having issues"},{"line_number":206,"context_line":""},{"line_number":207,"context_line":".. note::"},{"line_number":208,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_a5f8d97a","line":205,"in_reply_to":"5f93b717_b5b9f9e9","updated":"2018-01-30 12:24:34.000000000","message":"No. As I understand it (see comments in previous patch set) this is saying the proposed API will return 503 anytime one of the tests will show UNHEALTHY in the response body, for any reason.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"ea0420b174ef475b45127873e68c704e82b92f07","unresolved":false,"context_lines":[{"line_number":202,"context_line":"      }"},{"line_number":203,"context_line":""},{"line_number":204,"context_line":"   :statuscode 200: Service is running correctly"},{"line_number":205,"context_line":"   :statuscode 503: Service is having issues"},{"line_number":206,"context_line":""},{"line_number":207,"context_line":".. 
note::"},{"line_number":208,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_b5b9f9e9","line":205,"in_reply_to":"5f93b717_dabcf2dd","updated":"2018-01-29 23:54:24.000000000","message":"I\u0027m confused about what 503 is meant to mean here? 503 would be correct if the server answering the URI is unable to reach the application which actually services the URI. So it makes sense when a reverse proxy can\u0027t talk to something, because of e.g. load. Yes, it is \"extremely common pattern\" in those scenarios.\n\nThat\u0027s not what\u0027s being described here, is it?","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":206,"context_line":""},{"line_number":207,"context_line":".. note::"},{"line_number":208,"context_line":""},{"line_number":209,"context_line":"   Currently, this middleware will only work for services that expose an API,"},{"line_number":210,"context_line":"   which only covers a small section of OpenStack services - many of them do"},{"line_number":211,"context_line":"   not - e.g.:"},{"line_number":212,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_7b495f89","line":209,"range":{"start_line":209,"start_character":3,"end_line":209,"end_character":12},"updated":"2018-01-26 15:40:31.000000000","message":"\"Initially\". It doesn\u0027t exist yet, so there is no \"currently\".","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"61510052649dedfcd60d7e88317ec880da095a8a","unresolved":false,"context_lines":[{"line_number":206,"context_line":""},{"line_number":207,"context_line":".. 
note::"},{"line_number":208,"context_line":""},{"line_number":209,"context_line":"   Currently, this middleware will only work for services that expose an API,"},{"line_number":210,"context_line":"   which only covers a small section of OpenStack services - many of them do"},{"line_number":211,"context_line":"   not - e.g.:"},{"line_number":212,"context_line":""}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_cde767e4","line":209,"updated":"2018-01-29 14:41:18.000000000","message":"Will update to \"Initially\"","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"61510052649dedfcd60d7e88317ec880da095a8a","unresolved":false,"context_lines":[{"line_number":216,"context_line":"   * many many more."},{"line_number":217,"context_line":""},{"line_number":218,"context_line":"   These services should be extended to expose a HTTP service, which runs this"},{"line_number":219,"context_line":"   middleware."},{"line_number":220,"context_line":""},{"line_number":221,"context_line":"Reviewer activity"},{"line_number":222,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_edea63aa","line":219,"updated":"2018-01-29 14:41:18.000000000","message":"Ack","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"821a8e37ca4f0cc25c84e5d67b74409b8d848ef8","unresolved":false,"context_lines":[{"line_number":216,"context_line":"   * many many more."},{"line_number":217,"context_line":""},{"line_number":218,"context_line":"   These services should be extended to expose a HTTP service, which runs this"},{"line_number":219,"context_line":"   
middleware."},{"line_number":220,"context_line":""},{"line_number":221,"context_line":"Reviewer activity"},{"line_number":222,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":5,"id":"5f93b717_49e29903","line":219,"range":{"start_line":219,"start_character":3,"end_line":219,"end_character":13},"updated":"2018-01-26 15:40:31.000000000","message":"referring back to discussion in PS4, if you don\u0027t have this behind the authtoken middleware and doing a policy check, then this is insecure. That\u0027s a major design point with serious implications. Please call that out in the spec so everyone sees this and we can hammer out agreement.","commit_id":"1fbe4e59b96b9ffac1a6e8c5bf6caa19deb71825"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"2235da312f59aed480fa4bedc0a52298c4ae236b","unresolved":false,"context_lines":[{"line_number":46,"context_line":""},{"line_number":47,"context_line":"Initially this will be worked through using Designate, and as the model"},{"line_number":48,"context_line":"solidifies we will start encouraging other projects to use it."},{"line_number":49,"context_line":""},{"line_number":50,"context_line":"Initial Design"},{"line_number":51,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":52,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_7699b0ce","line":49,"updated":"2018-02-14 17:47:01.000000000","message":"Sure - we can add a section, but the 2 middlewares will work side by side, so it will be a case of just running the 2 for a cycle or two","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":6618,"name":"Ruby 
Loo","email":"opensrloo@gmail.com","username":"rloo"},"change_message_id":"b6e99ced500f2ae07fea42ff3bcd1b89e8551fe6","unresolved":false,"context_lines":[{"line_number":46,"context_line":""},{"line_number":47,"context_line":"Initially this will be worked through using Designate, and as the model"},{"line_number":48,"context_line":"solidifies we will start encouraging other projects to use it."},{"line_number":49,"context_line":""},{"line_number":50,"context_line":"Initial Design"},{"line_number":51,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":52,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_15b41bd0","line":49,"updated":"2018-02-12 23:09:27.000000000","message":"ironic was *just* about to use the existing healthcheck middleware mentioned above at L38. (Well, we\u0027re working on a code patch to add that in.)\n\nWould it be fair to ask you to add something about how a project should gracefully deprecate the use of the old healthcheck and adopt this one, so that users will still love us?","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":10343,"name":"Jim Rollenhagen","email":"jim@jimrollenhagen.com","username":"jimrollenhagen"},"change_message_id":"3b217a828f03fba0c1037ab4dd6f8c9b479af8df","unresolved":false,"context_lines":[{"line_number":46,"context_line":""},{"line_number":47,"context_line":"Initially this will be worked through using Designate, and as the model"},{"line_number":48,"context_line":"solidifies we will start encouraging other projects to use it."},{"line_number":49,"context_line":""},{"line_number":50,"context_line":"Initial 
Design"},{"line_number":51,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":52,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_5ce052fb","line":49,"in_reply_to":"3fa0c359_0626fb17","updated":"2018-02-14 21:07:32.000000000","message":"rloo: you are correct :)","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":6618,"name":"Ruby Loo","email":"opensrloo@gmail.com","username":"rloo"},"change_message_id":"7317fcf03aa2e0dd7875bdd80964061f4aaa1a5f","unresolved":false,"context_lines":[{"line_number":46,"context_line":""},{"line_number":47,"context_line":"Initially this will be worked through using Designate, and as the model"},{"line_number":48,"context_line":"solidifies we will start encouraging other projects to use it."},{"line_number":49,"context_line":""},{"line_number":50,"context_line":"Initial Design"},{"line_number":51,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":52,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_0626fb17","line":49,"in_reply_to":"3fa0c359_7699b0ce","updated":"2018-02-14 20:02:12.000000000","message":"So we run the two middlewares, and they can live side by side together because they have different endpoints?\n\nexisting healthcheck: If I understand (based on [1]), it\u0027d be HEAD /healthcheck or GET /healthcheck\nthis proposal: GET /healthcheck/v1 and in the future GET /healthcheck/vX ?\n\nIf yes, then all is good :)\n\n[1] https://docs.openstack.org/oslo.middleware/latest/reference/healthcheck_plugins.html","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":2394,"name":"Adam 
Spiers","email":"aspiers@suse.com","username":"adam.spiers"},"change_message_id":"26945e8bd3f72459b4e3af2d61e3782935e92714","unresolved":false,"context_lines":[{"line_number":160,"context_line":""},{"line_number":161,"context_line":"It will be versioned completely separately from the services API, as the"},{"line_number":162,"context_line":"version **should** be consistent across an OpenStack deployment, and will"},{"line_number":163,"context_line":"refer to the structure of the healthcheck response. The versions availible may"},{"line_number":164,"context_line":"be in flux as upgrades happen, but it should be possible to have a consistent"},{"line_number":165,"context_line":"version across services."},{"line_number":166,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_dcf6add9","line":163,"range":{"start_line":163,"start_character":65,"end_line":163,"end_character":74},"updated":"2018-02-14 15:48:01.000000000","message":"*available","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"2235da312f59aed480fa4bedc0a52298c4ae236b","unresolved":false,"context_lines":[{"line_number":160,"context_line":""},{"line_number":161,"context_line":"It will be versioned completely separately from the services API, as the"},{"line_number":162,"context_line":"version **should** be consistent across an OpenStack deployment, and will"},{"line_number":163,"context_line":"refer to the structure of the healthcheck response. 
The versions availible may"},{"line_number":164,"context_line":"be in flux as upgrades happen, but it should be possible to have a consistent"},{"line_number":165,"context_line":"version across services."},{"line_number":166,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_969ebcb6","line":163,"updated":"2018-02-14 17:47:01.000000000","message":"Ack","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":2394,"name":"Adam Spiers","email":"aspiers@suse.com","username":"adam.spiers"},"change_message_id":"26945e8bd3f72459b4e3af2d61e3782935e92714","unresolved":false,"context_lines":[{"line_number":166,"context_line":""},{"line_number":167,"context_line":"This API **should not** use microversions. The majority of tools that will"},{"line_number":168,"context_line":"consume this API will not have the ability to use microversions, which negates"},{"line_number":169,"context_line":"any benifit to using microversions."},{"line_number":170,"context_line":""},{"line_number":171,"context_line":"The sample below represents the health of a single service process - in this"},{"line_number":172,"context_line":"case a ``designate-api`` process."}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_bcfde1b2","line":169,"range":{"start_line":169,"start_character":4,"end_line":169,"end_character":11},"updated":"2018-02-14 15:48:01.000000000","message":"*benefit","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"2235da312f59aed480fa4bedc0a52298c4ae236b","unresolved":false,"context_lines":[{"line_number":166,"context_line":""},{"line_number":167,"context_line":"This API **should not** use microversions. 
The majority of tools that will"},{"line_number":168,"context_line":"consume this API will not have the ability to use microversions, which negates"},{"line_number":169,"context_line":"any benifit to using microversions."},{"line_number":170,"context_line":""},{"line_number":171,"context_line":"The sample below represents the health of a single service process - in this"},{"line_number":172,"context_line":"case a ``designate-api`` process."}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_36a32803","line":169,"updated":"2018-02-14 17:47:01.000000000","message":"Ack","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":1063,"name":"Ed Leafe","email":"ed@leafe.com","username":"ed-leafe"},"change_message_id":"99968654db331911a3ef6291f2835c662c7ddac2","unresolved":false,"context_lines":[{"line_number":227,"context_line":""},{"line_number":228,"context_line":"The suggested status codes are in violation of the current API Guidelines."},{"line_number":229,"context_line":"However, the large majority of services that will use this endpoint work on a"},{"line_number":230,"context_line":"very basic set of logic, where ``status code`` \u003e 400 \u003d\u003d good, and \u003c 400 \u003d\u003d bad."},{"line_number":231,"context_line":""},{"line_number":232,"context_line":"E.g. `Kubernetes Liveness Probes`_ or `HAProxy HTTP Checks`_ either use the raw"},{"line_number":233,"context_line":"status code, or a regex on the body."}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_6f29f634","line":230,"range":{"start_line":230,"start_character":31,"end_line":230,"end_character":79},"updated":"2018-02-13 18:42:35.000000000","message":"A bit confusing. 
Generally, \u003e\u003d400 is considered bad, no?","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":8099,"name":"Graham Hayes","email":"gr@ham.ie","username":"graham"},"change_message_id":"2235da312f59aed480fa4bedc0a52298c4ae236b","unresolved":false,"context_lines":[{"line_number":227,"context_line":""},{"line_number":228,"context_line":"The suggested status codes are in violation of the current API Guidelines."},{"line_number":229,"context_line":"However, the large majority of services that will use this endpoint work on a"},{"line_number":230,"context_line":"very basic set of logic, where ``status code`` \u003e 400 \u003d\u003d good, and \u003c 400 \u003d\u003d bad."},{"line_number":231,"context_line":""},{"line_number":232,"context_line":"E.g. `Kubernetes Liveness Probes`_ or `HAProxy HTTP Checks`_ either use the raw"},{"line_number":233,"context_line":"status code, or a regex on the body."}],"source_content_type":"text/x-rst","patch_set":6,"id":"3fa0c359_56a834da","line":230,"updated":"2018-02-14 17:47:01.000000000","message":"Yes, yes it does - I managed to swap them, even after re-reading the section twice -_-","commit_id":"db99c382430abef8010cdfcc986756b3b7ba4d35"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"1da1ac9d4eb2b6b96ceb6d7007e01990a5399a19","unresolved":false,"context_lines":[{"line_number":4,"context_line":""},{"line_number":5,"context_line":"Currently in OpenStack there is no standard way to determine the health of a"},{"line_number":6,"context_line":"service. 
Many projects have a \"status\" or \"agents\" API that can list"},{"line_number":7,"context_line":"rudimentary \"health\" parameters."},{"line_number":8,"context_line":""},{"line_number":9,"context_line":"Having a common piece of opinionated middleware that can be shared across"},{"line_number":10,"context_line":"projects means that operators only have to produce one set of tooling to poll"}],"source_content_type":"text/x-rst","patch_set":7,"id":"3fa0c359_d9f7028b","line":7,"updated":"2018-02-21 18:18:10.000000000","message":"You mention the existing healthcheck middleware below, how this new thing is different from that might be sensible to state as quickly as possible.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":16,"context_line":"Provide a way for a project to expose healthcheck metrics, in a simple"},{"line_number":17,"context_line":"standard API."},{"line_number":18,"context_line":""},{"line_number":19,"context_line":"This would allow for orchestrators to see the results of deploying OpenStack"},{"line_number":20,"context_line":"components (for example a kubernetes pod which exposes this API would allow"},{"line_number":21,"context_line":"users to use native kubernetes liveness checks to see if a pod started"},{"line_number":22,"context_line":"correctly), allow monitoring tooling to be more precise about failures"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_4d6db430","line":19,"range":{"start_line":19,"start_character":5,"end_line":19,"end_character":16},"updated":"2018-03-09 16:05:19.000000000","message":"This talks about orchestrators and other software tooling, but not actual end users. I would love to call these APIs as a user, to see how things are going. 
Either after something didn\u0027t work, or proactively before I notice any issues. Having tools that automate that is great, but I want to see this information in Horizon or a custom GUI whenever I want, regardless of tooling.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":8482,"name":"Colleen Murphy","email":"colleen@gazlene.net","username":"krinkle"},"change_message_id":"289973524ae9a4a4cec7a521961ef9e5ce4885bb","unresolved":false,"context_lines":[{"line_number":16,"context_line":"Provide a way for a project to expose healthcheck metrics, in a simple"},{"line_number":17,"context_line":"standard API."},{"line_number":18,"context_line":""},{"line_number":19,"context_line":"This would allow for orchestrators to see the results of deploying OpenStack"},{"line_number":20,"context_line":"components (for example a kubernetes pod which exposes this API would allow"},{"line_number":21,"context_line":"users to use native kubernetes liveness checks to see if a pod started"},{"line_number":22,"context_line":"correctly), allow monitoring tooling to be more precise about failures"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_13e5d138","line":19,"range":{"start_line":19,"start_character":5,"end_line":19,"end_character":16},"in_reply_to":"df7087c5_4d6db430","updated":"2018-03-19 09:00:42.000000000","message":"That would make the auth problem a whole lot harder. 
Exposing this to end users does seem useful but not as an API, maybe just as an HTML page that slurps data from this API into a status dashboard in horizon.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":7769,"name":"Pentheus","display_name":"Alan Meadows","email":"alan.meadows@gmail.com","username":"alanmeadows"},"change_message_id":"952336752a983b0c7a00345aa3e71b4d3c827ec0","unresolved":false,"context_lines":[{"line_number":16,"context_line":"Provide a way for a project to expose healthcheck metrics, in a simple"},{"line_number":17,"context_line":"standard API."},{"line_number":18,"context_line":""},{"line_number":19,"context_line":"This would allow for orchestrators to see the results of deploying OpenStack"},{"line_number":20,"context_line":"components (for example a kubernetes pod which exposes this API would allow"},{"line_number":21,"context_line":"users to use native kubernetes liveness checks to see if a pod started"},{"line_number":22,"context_line":"correctly), allow monitoring tooling to be more precise about failures"}],"source_content_type":"text/x-rst","patch_set":7,"id":"3f79a3b5_6e91ac53","line":19,"range":{"start_line":19,"start_character":5,"end_line":19,"end_character":16},"in_reply_to":"df7087c5_4d6db430","updated":"2018-09-11 23:02:45.000000000","message":"To be sure, the major impetus of this whole effort is to allow orchestration to make intelligent decisions about the state of OpenStack and bubble up explicit issues so that looking at logs to see exceptions becomes a last resort, not business as usual.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"1da1ac9d4eb2b6b96ceb6d7007e01990a5399a19","unresolved":false,"context_lines":[{"line_number":22,"context_line":"correctly), allow monitoring tooling to be more precise about failures"},{"line_number":23,"context_line":"(instead of 
just checking if a process is running, tools like Icinga can"},{"line_number":24,"context_line":"check the service is actually running correctly), and allow tools like Vitrage"},{"line_number":25,"context_line":"and other self healing tools determine root causes of failures (if all"},{"line_number":26,"context_line":"services cannot access a database, the failure is on the database or"},{"line_number":27,"context_line":"networking side, not on the service configuration side.)"},{"line_number":28,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"3fa0c359_b93f9623","line":25,"range":{"start_line":25,"start_character":29,"end_line":25,"end_character":38},"updated":"2018-02-21 18:18:10.000000000","message":"nit: to determine","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10343,"name":"Jim Rollenhagen","email":"jim@jimrollenhagen.com","username":"jimrollenhagen"},"change_message_id":"818dc1d320f718f19e6096005e5c6e82dd2313ca","unresolved":false,"context_lines":[{"line_number":36,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":37,"context_line":""},{"line_number":38,"context_line":"The `Healthcheck`_ middleware in the current oslo.middleware repo exists, and it"},{"line_number":39,"context_line":"is currently used by Glance, Glare, Keystone, Manila, Swift and Vitrage."},{"line_number":40,"context_line":""},{"line_number":41,"context_line":"However this middleware is limited in scope, and expanding its capabilities"},{"line_number":42,"context_line":"would cause compatibility issues for current users."}],"source_content_type":"text/x-rst","patch_set":7,"id":"1f9dbf25_bec15531","line":39,"updated":"2018-02-27 12:02:26.000000000","message":"Worth adding something like \"and can be injected into most openstack services via the paste 
config\"?","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":38,"context_line":"The `Healthcheck`_ middleware in the current oslo.middleware repo exists, and it"},{"line_number":39,"context_line":"is currently used by Glance, Glare, Keystone, Manila, Swift and Vitrage."},{"line_number":40,"context_line":""},{"line_number":41,"context_line":"However this middleware is limited in scope, and expanding its capabilities"},{"line_number":42,"context_line":"would cause compatibility issues for current users."},{"line_number":43,"context_line":""},{"line_number":44,"context_line":"Proposed adoption model/plan"},{"line_number":45,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_6d519862","line":42,"range":{"start_line":41,"start_character":27,"end_line":42,"end_character":32},"updated":"2018-03-09 16:05:19.000000000","message":"expand?","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":47,"context_line":"Initially this will be worked through using Designate, and as the model"},{"line_number":48,"context_line":"solidifies we will start encouraging other projects to use it."},{"line_number":49,"context_line":""},{"line_number":50,"context_line":"For projects currently the oslo.middleware healthcheck we should allow both"},{"line_number":51,"context_line":"middlewares to be run at the same time to allow a phased migration from 
one"},{"line_number":52,"context_line":"to the other."},{"line_number":53,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_2de08096","line":50,"range":{"start_line":50,"start_character":13,"end_line":50,"end_character":22},"updated":"2018-03-09 16:05:19.000000000","message":"currently using","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10343,"name":"Jim Rollenhagen","email":"jim@jimrollenhagen.com","username":"jimrollenhagen"},"change_message_id":"818dc1d320f718f19e6096005e5c6e82dd2313ca","unresolved":false,"context_lines":[{"line_number":64,"context_line":"* A base test plugin, that can be used to create custom tests per service."},{"line_number":65,"context_line":""},{"line_number":66,"context_line":"Splitting the tests and the reporting allows for more in-depth tests that may"},{"line_number":67,"context_line":"not want to run on every call (to try and avoid DoS attacks)."},{"line_number":68,"context_line":""},{"line_number":69,"context_line":"A few builtin tests should be provided as well - oslo.messaging / db / keystone"},{"line_number":70,"context_line":"would be good examples."}],"source_content_type":"text/x-rst","patch_set":7,"id":"1f9dbf25_0166eeee","line":67,"updated":"2018-02-27 12:02:26.000000000","message":"Do we have any checks in mind that would be too slow to do synchronously in the healthcheck request? I\u0027m not sure we should assume we\u0027ll have them.\n\nIf a service does want to do something longer-running, it could make its own periodic tests for that (or update as needed, e.g. when foo-service realizes that etcd is unreachable, mark that test unhealthy, mark it healthy when etcd can be reached again). 
Then the service would have the function powering the healthcheck just fetch that result.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":64,"context_line":"* A base test plugin, that can be used to create custom tests per service."},{"line_number":65,"context_line":""},{"line_number":66,"context_line":"Splitting the tests and the reporting allows for more in-depth tests that may"},{"line_number":67,"context_line":"not want to run on every call (to try and avoid DoS attacks)."},{"line_number":68,"context_line":""},{"line_number":69,"context_line":"A few builtin tests should be provided as well - oslo.messaging / db / keystone"},{"line_number":70,"context_line":"would be good examples."}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_cd4c64a0","line":67,"in_reply_to":"1f9dbf25_0166eeee","updated":"2018-03-09 16:05:19.000000000","message":"I think running some things periodically so that the results are ready when requested will be great for some things, but I also fully expect there to be expensive tests which you wouldn\u0027t want to run very often, even on a periodic test. For these, give callers a way to request that they run, so they can schedule that when they want it (e.g. 
2am), and come back later to ask for the results.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":7769,"name":"Pentheus","display_name":"Alan Meadows","email":"alan.meadows@gmail.com","username":"alanmeadows"},"change_message_id":"952336752a983b0c7a00345aa3e71b4d3c827ec0","unresolved":false,"context_lines":[{"line_number":64,"context_line":"* A base test plugin, that can be used to create custom tests per service."},{"line_number":65,"context_line":""},{"line_number":66,"context_line":"Splitting the tests and the reporting allows for more in-depth tests that may"},{"line_number":67,"context_line":"not want to run on every call (to try and avoid DoS attacks)."},{"line_number":68,"context_line":""},{"line_number":69,"context_line":"A few builtin tests should be provided as well - oslo.messaging / db / keystone"},{"line_number":70,"context_line":"would be good examples."}],"source_content_type":"text/x-rst","patch_set":7,"id":"3f79a3b5_ee8b7c14","line":67,"in_reply_to":"1f9dbf25_0166eeee","updated":"2018-09-11 23:02:45.000000000","message":"One example of this is dependent on whether the architecture mandates agent like services (e.g. nova-compute) are required to expose a health API or whether they are aggregated at a nova-api level.  In the latter case, you may not want to poll potentially thousands of agents synchronously.  
The major issue I see with support for asynchronous gathering is now we must store the results somewhere.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":8482,"name":"Colleen Murphy","email":"colleen@gazlene.net","username":"krinkle"},"change_message_id":"289973524ae9a4a4cec7a521961ef9e5ce4885bb","unresolved":false,"context_lines":[{"line_number":69,"context_line":"A few builtin tests should be provided as well - oslo.messaging / db / keystone"},{"line_number":70,"context_line":"would be good examples."},{"line_number":71,"context_line":""},{"line_number":72,"context_line":"These tests could be part of the oslo.db / oslo.messaging keystoneauth"},{"line_number":73,"context_line":"libraries, and should do basic checks (checking that the database / queue /"},{"line_number":74,"context_line":"keystone is reachable and the credentials are valid are good starting points.)"},{"line_number":75,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_f39e1db7","line":72,"range":{"start_line":72,"start_character":58,"end_line":72,"end_character":70},"updated":"2018-03-19 09:00:42.000000000","message":"I would object to adding more things to keystoneauth. We try to keep its dependency list as slim as possible and even so it already suffers from scope creep. We also strive for %100 backwards compatibility so if the health check test interface changes we\u0027d run into trouble. 
However, it would be super easy to write a wrapper around keystoneauth for this.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":73,"context_line":"libraries, and should do basic checks (checking that the database / queue /"},{"line_number":74,"context_line":"keystone is reachable and the credentials are valid are good starting points.)"},{"line_number":75,"context_line":""},{"line_number":76,"context_line":"These plugins could be loaded dynamically by reading entrypoints which loads"},{"line_number":77,"context_line":"the custom tests (based on a base plugin that returns results in a very defined"},{"line_number":78,"context_line":"format)."},{"line_number":79,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_ed34c8fd","line":76,"range":{"start_line":76,"start_character":71,"end_line":76,"end_character":76},"updated":"2018-03-09 16:05:19.000000000","message":"nit: load","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10343,"name":"Jim Rollenhagen","email":"jim@jimrollenhagen.com","username":"jimrollenhagen"},"change_message_id":"818dc1d320f718f19e6096005e5c6e82dd2313ca","unresolved":false,"context_lines":[{"line_number":78,"context_line":"format)."},{"line_number":79,"context_line":""},{"line_number":80,"context_line":"This allows for flexibility for services, without causing tooling to be updated"},{"line_number":81,"context_line":"for each new service."},{"line_number":82,"context_line":""},{"line_number":83,"context_line":"These results will only be stored locally, and not as a timeseries set of"},{"line_number":84,"context_line":"data - there will always only be one set of results per test, along with 
a"}],"source_content_type":"text/x-rst","patch_set":7,"id":"1f9dbf25_a11b7a71","line":81,"updated":"2018-02-27 12:02:26.000000000","message":"I do think we should do some basic checks like this - have definitely run into times where my LB thinks a service is healthy, but the database is down.\n\nThat said, do we really need a plugin architecture to do such a thing? I\u0027d rather the middleware code hosts the checks, and the service tells the middleware explicitly what checks should be used.\n\nI especially don\u0027t want a service to start running new tests which contribute to the healthcheck, simply because some library gets installed (unrelated to the service). Imagine the openstack users which run all API services on a single host, with system packages. :)\n\nI imagine the API being something like:\n\n    def _check_baz():\n        # check a thing...\n\n    app \u003d _build_wsgi_app()\n\n    check_methods \u003d [\n        oslo_db.healthcheck,\n        oslo_messaging.healthcheck\n        my_service.check_foo,\n        my_service.check_bar,\n        _check_baz\n    ]\n\n    health_middleware \u003d HealthcheckMiddleware(app, check_methods)","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10343,"name":"Jim Rollenhagen","email":"jim@jimrollenhagen.com","username":"jimrollenhagen"},"change_message_id":"818dc1d320f718f19e6096005e5c6e82dd2313ca","unresolved":false,"context_lines":[{"line_number":83,"context_line":"These results will only be stored locally, and not as a timeseries set of"},{"line_number":84,"context_line":"data - there will always only be one set of results per test, along with a"},{"line_number":85,"context_line":"timestamp of when the result was updated. 
`designate service-status`_ is an"},{"line_number":86,"context_line":"example of this."},{"line_number":87,"context_line":""},{"line_number":88,"context_line":"Statuses"},{"line_number":89,"context_line":"^^^^^^^^"}],"source_content_type":"text/x-rst","patch_set":7,"id":"1f9dbf25_5e629118","line":86,"updated":"2018-02-27 12:02:26.000000000","message":"Where is this stored? In memory?","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11628,"name":"Michael Johnson","email":"johnsomor@gmail.com","username":"johnsom"},"change_message_id":"e6cf7e266c6f2e8be9b96d97aeda29662aecf998","unresolved":false,"context_lines":[{"line_number":85,"context_line":"timestamp of when the result was updated. `designate service-status`_ is an"},{"line_number":86,"context_line":"example of this."},{"line_number":87,"context_line":""},{"line_number":88,"context_line":"Statuses"},{"line_number":89,"context_line":"^^^^^^^^"},{"line_number":90,"context_line":""},{"line_number":91,"context_line":".. note::"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_977b4926","line":88,"range":{"start_line":88,"start_character":0,"end_line":88,"end_character":8},"updated":"2018-03-15 16:54:13.000000000","message":"As mentioned below, I think this needs to be better defined.\nIs this overall service health? I.e. 
the control plane (and all of its processes) is functional enough for a user to be able to consume the service?","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10850,"name":"German Eichberger","email":"german.eichberger@gmail.com","username":"german"},"change_message_id":"c376023d908f1f6fa17f9d877776a674c3c4777a","unresolved":false,"context_lines":[{"line_number":97,"context_line":""},{"line_number":98,"context_line":"* HEALTHY"},{"line_number":99,"context_line":"* UNHEALTHY"},{"line_number":100,"context_line":"* INITIALISING"},{"line_number":101,"context_line":""},{"line_number":102,"context_line":"Test Plugin"},{"line_number":103,"context_line":"-----------"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_b597d790","line":100,"range":{"start_line":100,"start_character":2,"end_line":100,"end_character":14},"updated":"2018-03-21 22:55:51.000000000","message":"NIT: initialiZing","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":129,"context_line":"\"UNHEALTHY\"."},{"line_number":130,"context_line":""},{"line_number":131,"context_line":""},{"line_number":132,"context_line":"This is used when looking up the results when an API call arrives, so the #"},{"line_number":133,"context_line":"following flow will happen:"},{"line_number":134,"context_line":""},{"line_number":135,"context_line":"1. 
Lookup test result"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_c0d3cb37","line":132,"range":{"start_line":132,"start_character":74,"end_line":132,"end_character":75},"updated":"2018-03-09 16:05:19.000000000","message":"typo?","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":133,"context_line":"following flow will happen:"},{"line_number":134,"context_line":""},{"line_number":135,"context_line":"1. Lookup test result"},{"line_number":136,"context_line":"2. Is test results timestamp + tests timestampeout \u003c current time?"},{"line_number":137,"context_line":"2.1 If Yes - test is healthy"},{"line_number":138,"context_line":"2.2 If No - test is considered unhealthy."},{"line_number":139,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_a0aed795","line":136,"range":{"start_line":136,"start_character":37,"end_line":136,"end_character":50},"updated":"2018-03-09 16:05:19.000000000","message":"typo?","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"1da1ac9d4eb2b6b96ceb6d7007e01990a5399a19","unresolved":false,"context_lines":[{"line_number":163,"context_line":"   so this should be used as a base until it is approved."},{"line_number":164,"context_line":""},{"line_number":165,"context_line":"It will be versioned completely separately from the services API, as the"},{"line_number":166,"context_line":"version **should** be consistent across an OpenStack deployment, and will"},{"line_number":167,"context_line":"refer to the structure of the healthcheck response. 
The versions available may"},{"line_number":168,"context_line":"be in flux as upgrades happen, but it should be possible to have a consistent"},{"line_number":169,"context_line":"version across services."}],"source_content_type":"text/x-rst","patch_set":7,"id":"3fa0c359_794caec7","line":166,"range":{"start_line":166,"start_character":33,"end_line":166,"end_character":63},"updated":"2018-02-21 18:18:10.000000000","message":"I assume this is because tooling that inspects the output would like to be the same?","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11628,"name":"Michael Johnson","email":"johnsomor@gmail.com","username":"johnsom"},"change_message_id":"e6cf7e266c6f2e8be9b96d97aeda29662aecf998","unresolved":false,"context_lines":[{"line_number":185,"context_line":"      Content-Type: text/javascript"},{"line_number":186,"context_line":""},{"line_number":187,"context_line":"      {"},{"line_number":188,"context_line":"        \"status\": \"HEALTHY\","},{"line_number":189,"context_line":"        \"test_failures\": 0,"},{"line_number":190,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":191,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":192,"context_line":"        \"service\": \"designate-api\","}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_17a739bf","line":189,"range":{"start_line":188,"start_character":0,"end_line":189,"end_character":27},"updated":"2018-03-15 16:54:13.000000000","message":"I think we need more wording around what this \"status\" means.  I.e. 
\"test_failures\" can be \u003e0 and still be HEALTHY.\nI think this status should be defined as the minimum required components are functioning such that a user of the API can successful consume the service.\nThis could mean that there is only one combination of services still functioning and the rest is on fire.\nProjects would have to define the logic for this that says, using octavia for example: At least one API, one worker, and one health manager are healthy. Which, behind the scenes means they can get to DB, keystone, messaging, etc.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"1da1ac9d4eb2b6b96ceb6d7007e01990a5399a19","unresolved":false,"context_lines":[{"line_number":188,"context_line":"        \"status\": \"HEALTHY\","},{"line_number":189,"context_line":"        \"test_failures\": 0,"},{"line_number":190,"context_line":"        \"message\": \"Service is running normally\","},{"line_number":191,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":192,"context_line":"        \"service\": \"designate-api\","},{"line_number":193,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":194,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","}],"source_content_type":"text/x-rst","patch_set":7,"id":"3fa0c359_b9db16eb","line":191,"updated":"2018-02-21 18:18:10.000000000","message":"would a last-modified header work for this instead","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11628,"name":"Michael Johnson","email":"johnsomor@gmail.com","username":"johnsom"},"change_message_id":"e6cf7e266c6f2e8be9b96d97aeda29662aecf998","unresolved":false,"context_lines":[{"line_number":191,"context_line":"        \"date\": \"2018-01-04T15:10:43.511Z\","},{"line_number":192,"context_line":"        \"service\": 
\"designate-api\","},{"line_number":193,"context_line":"        \"service_id\": \"af91edb5-ede8-453f-af13-feabdd088f9c\","},{"line_number":194,"context_line":"        \"hostname\": \"api-0001-us1az1.designate.example.com\","},{"line_number":195,"context_line":"        \"created_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":196,"context_line":"        \"heartbeated_at\": \"2018-01-01T12:16:29.511Z\","},{"line_number":197,"context_line":"        \"tests\": {"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_37491d5f","line":194,"range":{"start_line":194,"start_character":8,"end_line":194,"end_character":59},"updated":"2018-03-15 16:54:13.000000000","message":"To me this is the most \"sensitive\" thing in here. Maybe make this optional?","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"1da1ac9d4eb2b6b96ceb6d7007e01990a5399a19","unresolved":false,"context_lines":[{"line_number":234,"context_line":"very basic set of logic, where ``status code`` \u003c 400 \u003d\u003d good, and \u003e 400 \u003d\u003d bad."},{"line_number":235,"context_line":""},{"line_number":236,"context_line":"E.g. `Kubernetes Liveness Probes`_ or `HAProxy HTTP Checks`_ either use the raw"},{"line_number":237,"context_line":"status code, or a regex on the body."},{"line_number":238,"context_line":""},{"line_number":239,"context_line":".. note:: The exact status code is open for debate, but it should be \u003e\u003d 400"},{"line_number":240,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"3fa0c359_145c4feb","line":237,"updated":"2018-02-21 18:18:10.000000000","message":"What, then, is the value of so much information in the response? 
The degree of detail in granularity is surprisingly high and the use for it not clearly described.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10343,"name":"Jim Rollenhagen","email":"jim@jimrollenhagen.com","username":"jimrollenhagen"},"change_message_id":"818dc1d320f718f19e6096005e5c6e82dd2313ca","unresolved":false,"context_lines":[{"line_number":234,"context_line":"very basic set of logic, where ``status code`` \u003c 400 \u003d\u003d good, and \u003e 400 \u003d\u003d bad."},{"line_number":235,"context_line":""},{"line_number":236,"context_line":"E.g. `Kubernetes Liveness Probes`_ or `HAProxy HTTP Checks`_ either use the raw"},{"line_number":237,"context_line":"status code, or a regex on the body."},{"line_number":238,"context_line":""},{"line_number":239,"context_line":".. note:: The exact status code is open for debate, but it should be \u003e\u003d 400"},{"line_number":240,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"1f9dbf25_7ec2edeb","line":237,"in_reply_to":"3fa0c359_145c4feb","updated":"2018-02-27 12:02:26.000000000","message":"++","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"1da1ac9d4eb2b6b96ceb6d7007e01990a5399a19","unresolved":false,"context_lines":[{"line_number":245,"context_line":""},{"line_number":246,"context_line":"* Most healthchecking systems do not have the ability for complex logic"},{"line_number":247,"context_line":"  required for keystone authentication"},{"line_number":248,"context_line":"* A failed keystone could mask other failures"},{"line_number":249,"context_line":""},{"line_number":250,"context_line":".. 
note::"},{"line_number":251,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"3fa0c359_14f52fed","line":248,"updated":"2018-02-21 18:18:10.000000000","message":"On the etherpad someone said \"No no no, can\u0027t just leave it open.\"\nhttps://etherpad.openstack.org/p/sydney-cloud-native-partii","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":245,"context_line":""},{"line_number":246,"context_line":"* Most healthchecking systems do not have the ability for complex logic"},{"line_number":247,"context_line":"  required for keystone authentication"},{"line_number":248,"context_line":"* A failed keystone could mask other failures"},{"line_number":249,"context_line":""},{"line_number":250,"context_line":".. note::"},{"line_number":251,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_ede048cf","line":248,"in_reply_to":"1f9dbf25_9efc993f","updated":"2018-03-09 16:05:19.000000000","message":"Good suggestions. Would it be possible to require auth on the external network but not on the internal network? Granted some callers of these APIs will not be able to auth, but some will, and why prohibit them (e.g. 
an end user) from using this when they\u0027re not on the internal network (assuming auth is required  for anything not internal, which it should be)?\n\nAlso \"A failed keystone could mask other failures\" would just be a reason to have another status code that you can show for things you weren\u0027t able to check because of the keystone issue, not a reason to avoid auth.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10343,"name":"Jim Rollenhagen","email":"jim@jimrollenhagen.com","username":"jimrollenhagen"},"change_message_id":"818dc1d320f718f19e6096005e5c6e82dd2313ca","unresolved":false,"context_lines":[{"line_number":245,"context_line":""},{"line_number":246,"context_line":"* Most healthchecking systems do not have the ability for complex logic"},{"line_number":247,"context_line":"  required for keystone authentication"},{"line_number":248,"context_line":"* A failed keystone could mask other failures"},{"line_number":249,"context_line":""},{"line_number":250,"context_line":".. note::"},{"line_number":251,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"1f9dbf25_9efc993f","line":248,"in_reply_to":"3fa0c359_14f52fed","updated":"2018-02-27 12:02:26.000000000","message":"I do agree that we can\u0027t really put auth on this, thanks to the tooling we expect to leverage this. But, also agree that some people won\u0027t want it available to all users. Do we:\n\n1) recommend that /healthcheck is restricted somehow (e.g. 
not exposed in the load balancer, so only available on the internal network for the control plane)\n\n2) have a \u0027detail\u0027 config parameter, defaulting to False, which controls exposing the detailed info (detail\u003dFalse would do something like just return the status code).\n\n3) both?","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":8482,"name":"Colleen Murphy","email":"colleen@gazlene.net","username":"krinkle"},"change_message_id":"289973524ae9a4a4cec7a521961ef9e5ce4885bb","unresolved":false,"context_lines":[{"line_number":245,"context_line":""},{"line_number":246,"context_line":"* Most healthchecking systems do not have the ability for complex logic"},{"line_number":247,"context_line":"  required for keystone authentication"},{"line_number":248,"context_line":"* A failed keystone could mask other failures"},{"line_number":249,"context_line":""},{"line_number":250,"context_line":".. note::"},{"line_number":251,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_102dc7a7","line":248,"in_reply_to":"df7087c5_b720cd9c","updated":"2018-03-19 09:00:42.000000000","message":"Agreed with the reasons for not using keystone auth, and additionally the way keystone does AuthZ is overkill for this. 
Something like SSL client certs or basic/digest auth seems perfectly reasonable.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11628,"name":"Michael Johnson","email":"johnsomor@gmail.com","username":"johnsom"},"change_message_id":"e6cf7e266c6f2e8be9b96d97aeda29662aecf998","unresolved":false,"context_lines":[{"line_number":245,"context_line":""},{"line_number":246,"context_line":"* Most healthchecking systems do not have the ability for complex logic"},{"line_number":247,"context_line":"  required for keystone authentication"},{"line_number":248,"context_line":"* A failed keystone could mask other failures"},{"line_number":249,"context_line":""},{"line_number":250,"context_line":".. note::"},{"line_number":251,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_b720cd9c","line":248,"in_reply_to":"df7087c5_ede048cf","updated":"2018-03-15 16:54:13.000000000","message":"I think the user should be able to use SSL client cert authentication if they want. This would allow authenticated service-to-service communication in a standard (not keystone) way.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":247,"context_line":"  required for keystone authentication"},{"line_number":248,"context_line":"* A failed keystone could mask other failures"},{"line_number":249,"context_line":""},{"line_number":250,"context_line":".. 
note::"},{"line_number":251,"context_line":""},{"line_number":252,"context_line":"   Currently, this middleware will only work for services that expose an API,"},{"line_number":253,"context_line":"   which only covers a small section of OpenStack services - many of them do"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_803b93e9","line":250,"updated":"2018-03-09 16:05:19.000000000","message":"important note, but doesn\u0027t really belong in the auth section","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":253,"context_line":"   which only covers a small section of OpenStack services - many of them do"},{"line_number":254,"context_line":"   not - e.g.:"},{"line_number":255,"context_line":""},{"line_number":256,"context_line":"   * nova-compute"},{"line_number":257,"context_line":"   * designate-(?!api)"},{"line_number":258,"context_line":"   * neutron-l3-agent"},{"line_number":259,"context_line":"   * many many more."}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_e3d5b944","line":256,"updated":"2018-03-09 16:05:19.000000000","message":"I think it would help get reviewers thinking to call out nova-conductor and nova-scheduler here as well.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10343,"name":"Jim Rollenhagen","email":"jim@jimrollenhagen.com","username":"jimrollenhagen"},"change_message_id":"818dc1d320f718f19e6096005e5c6e82dd2313ca","unresolved":false,"context_lines":[{"line_number":256,"context_line":"   * nova-compute"},{"line_number":257,"context_line":"   * designate-(?!api)"},{"line_number":258,"context_line":"   * neutron-l3-agent"},{"line_number":259,"context_line":"   * many many 
more."},{"line_number":260,"context_line":""},{"line_number":261,"context_line":"   These services should be extended to expose a HTTP service, which runs this"},{"line_number":262,"context_line":"   middleware."}],"source_content_type":"text/x-rst","patch_set":7,"id":"1f9dbf25_de0c6144","line":259,"updated":"2018-02-27 12:02:26.000000000","message":"I\u0027d prefer these services disable themselves when they aren\u0027t working, and report that to the DB or zookeeper or whatever is tracking which worker services are available.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10608,"name":"Matthew Edmonds","email":"edmondsw@us.ibm.com","username":"edmondsw"},"change_message_id":"259a565e1f02c8e0dc0d71177018852cff6a38c1","unresolved":false,"context_lines":[{"line_number":256,"context_line":"   * nova-compute"},{"line_number":257,"context_line":"   * designate-(?!api)"},{"line_number":258,"context_line":"   * neutron-l3-agent"},{"line_number":259,"context_line":"   * many many more."},{"line_number":260,"context_line":""},{"line_number":261,"context_line":"   These services should be extended to expose a HTTP service, which runs this"},{"line_number":262,"context_line":"   middleware."}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_636f6992","line":259,"in_reply_to":"1f9dbf25_de0c6144","updated":"2018-03-09 16:05:19.000000000","message":"and then have this new API surface that? 
Sure, except that automation then has to parse the body to know what to restart (or whatever the corrective action is) since the return code is no longer sufficient.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"1da1ac9d4eb2b6b96ceb6d7007e01990a5399a19","unresolved":false,"context_lines":[{"line_number":259,"context_line":"   * many many more."},{"line_number":260,"context_line":""},{"line_number":261,"context_line":"   These services should be extended to expose a HTTP service, which runs this"},{"line_number":262,"context_line":"   middleware."},{"line_number":263,"context_line":""},{"line_number":264,"context_line":"Reviewer activity"},{"line_number":265,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":7,"id":"3fa0c359_f9699ef7","line":262,"updated":"2018-02-21 18:18:10.000000000","message":"I\u0027m not sure this is a good idea. We should be striving to make services smaller with fewer ways in exposed. If nova-compute needs an HTTP based health agent then maybe it should be a process that runs to the side of nova-compute, etc?\n\nAt which point what we are talking about is on-host monitoring agents and it gets a bit confusing where the line is between this proposal and generic monitoring agents.\n\nHowever, I\u0027m happy to be told that \"adding a thread-like thing doing HTTP\" is the cloud-native way to achieve observability. 
If so there\u0027s quite a lot of work to do.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":11628,"name":"Michael Johnson","email":"johnsomor@gmail.com","username":"johnsom"},"change_message_id":"e6cf7e266c6f2e8be9b96d97aeda29662aecf998","unresolved":false,"context_lines":[{"line_number":259,"context_line":"   * many many more."},{"line_number":260,"context_line":""},{"line_number":261,"context_line":"   These services should be extended to expose a HTTP service, which runs this"},{"line_number":262,"context_line":"   middleware."},{"line_number":263,"context_line":""},{"line_number":264,"context_line":"Reviewer activity"},{"line_number":265,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_37e59d38","line":262,"in_reply_to":"3fa0c359_f9699ef7","updated":"2018-03-15 16:54:13.000000000","message":"I expect adding network paths to all of the services to expose an HTTP endpoint is also going to be an issue for some deployments. 
Some of our services are deployed in containers that currently only reach out (DB, messaging, keystone) and don\u0027t need inbound network paths.\nI think I would lean towards the services updating a DB table and having the API processes pull the other service info from there.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"},{"author":{"_account_id":10850,"name":"German Eichberger","email":"german.eichberger@gmail.com","username":"german"},"change_message_id":"c376023d908f1f6fa17f9d877776a674c3c4777a","unresolved":false,"context_lines":[{"line_number":259,"context_line":"   * many many more."},{"line_number":260,"context_line":""},{"line_number":261,"context_line":"   These services should be extended to expose a HTTP service, which runs this"},{"line_number":262,"context_line":"   middleware."},{"line_number":263,"context_line":""},{"line_number":264,"context_line":"Reviewer activity"},{"line_number":265,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":7,"id":"df7087c5_ab71b3e0","line":262,"in_reply_to":"df7087c5_37e59d38","updated":"2018-03-21 22:55:51.000000000","message":"As said above we mostly need a response which signifies Healthy/Unhealthy. If we follow the k8s example they also support a simple TCP endpoint among other things. I think having some alternative TCP socket might be a more lightweight alternative we can explore for simpler services. I really think a big selling point of this proposal is uniformity so excluding services because HTTP is too heavy for them doesn\u0027t seem right.\n\n@johnsom despite most of our services using the DB it\u0027s conceivable that even that might be too heavy for some.","commit_id":"e21c5f57482ba2a12f1f1823baaf447f58a26fcb"}]}
