Configure Graphoid Logstash Dashboard
Closed, Resolved (Public)

Description

Graphoid does not seem to appear in Logstash, either because it is perfect or because something is not working right. A proper dashboard is needed.

Event Timeline

Yurik raised the priority of this task from to Needs Triage.
Yurik updated the task description.
Yurik added projects: Services, Graphoid.
Yurik added subscribers: mobrovac, GWicke, Aklapper, Yurik.
Yurik set Security to None.

[The graphoid config.yaml](https://github.com/wikimedia/operations-puppet/blob/production/modules/graphoid/templates/config.yaml.erb) is missing a service-runner logging stanza [similar to restbase](https://github.com/wikimedia/operations-puppet/blob/production/modules/restbase/templates/config.yaml.erb#L11-L18).

That's not it. Graphoid uses service::node, which provides common config bits for all services, including metrics and logging. We need to investigate further what's going on here.

@mobrovac, I see.

It would be nice if service::node was following the regular service-runner config layout, as this would

  • keep all relevant configs for a service in one place,
  • avoid confusion about many different ways to do the same thing, and
  • retain support for running multiple services in a single service-runner instance.

I'm actually ok with having "generic" and "custom" parts, as long as the generic one works :) It would be good if there was an easy way to verify that it works. Can't wait for my sca1001 access: there have been several 600 usage spikes (compared to ~4-10 on average). I suspect it's due to huwiki, where they have added an automated graph to every city.

I think @mobrovac and I can work on fixing this.

@GWicke: service::node is an attempt (I guess pretty successful) at standardizing the way we deploy services, and to not repeat the same configs over and over. In the end, all services that use a "standard" structure (like service-runner) should be served via it. Finally, I don't think we'll ever want to run different services from the same service-runner instance in production, and coding for that would make our code both more complex and uglier.

@Joe, I see why you went for separate config files, and am okay with that as long as we can integrate that with the regular service-runner config management. One option we could consider is an include syntax for per-service configs, which would keep your separate config file working, while also avoiding a proliferation of custom config loading.

mobrovac raised the priority of this task from Low to High.
mobrovac removed a subscriber: Aklapper.

Change 211955 had a related patch set uploaded (by Mobrovac):
service::node: fix logstash port

https://gerrit.wikimedia.org/r/211955

Change 211955 merged by Giuseppe Lavagetto:
service::node: fix logstash port

https://gerrit.wikimedia.org/r/211955

The problem was the wrong Logstash port. It's all good now and Graphoid's dashboard can be found at https://logstash.wikimedia.org/#/dashboard/elasticsearch/Graphoid . Note that it is currently empty, as only warn, error and fatal log messages are sent there. For lower, info-level messages, you can consult /var/log/graphoid/main.log on the SCA nodes.
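To make the level split above concrete, here is a minimal sketch of how a record's level decides where it ends up. It uses the numeric levels of bunyan, the logger service-runner builds on. The function name `destinations` is made up for illustration, and the assumption that the local file starts at info is inferred from the comment above, not read from the actual config.

```python
# Bunyan's numeric log levels.
LEVELS = {"trace": 10, "debug": 20, "info": 30, "warn": 40, "error": 50, "fatal": 60}


def destinations(level, logstash_min="warn", file_min="info"):
    """Return where a record of the given level would be written.

    Both thresholds are assumptions for illustration: Logstash receives
    warn and above, the local main.log receives info and above.
    """
    value = LEVELS[level]
    targets = []
    if value >= LEVELS[file_min]:
        targets.append("file")
    if value >= LEVELS[logstash_min]:
        targets.append("logstash")
    return targets


for name in ("debug", "info", "error"):
    print(name, destinations(name))
```

Under these assumptions an info record only reaches the file, which would explain a dashboard that stays empty while the service is healthy.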

Resolving as there's nothing left to do.

Thank you @Yurik for raising this issue, as it affected Citoid as well.

@Joe, I see why you went for separate config files, and am okay with that as long as we can integrate that with the regular service-runner config management.

Indeed it does. The way service::node compiles the configuration file is that it uses the bits and pieces common to all SCA services (logging, metrics), and combines them with user-provided per-service config (the services[0].conf part in the config.yaml). The only thing it is not able to conform to (wrt. service-runner) is running multiple services using the same service-runner instance, but as @Joe pointed out, it is unlikely this will happen in WMF production, more concretely on the SCA cluster.
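As a rough illustration of the merging described above (not the actual Puppet template), the following sketch combines shared logging and metrics settings with a per-service config placed under `services[0].conf`. Every host, port, and helper name here (`COMMON_CONFIG`, `compile_config`, `logstash.example`, `statsd.example`, port 19000) is a placeholder assumption.

```python
import copy

# Hypothetical shared defaults; hosts and ports are placeholders,
# not the real production values.
COMMON_CONFIG = {
    "logging": {
        "level": "warn",
        "streams": [{"type": "gelf", "host": "logstash.example", "port": 12201}],
    },
    "metrics": {"type": "statsd", "host": "statsd.example", "port": 8125},
}


def compile_config(name, entry_point, service_conf):
    """Combine the shared logging/metrics bits with one service's own conf."""
    config = copy.deepcopy(COMMON_CONFIG)  # never mutate the shared defaults
    config["logging"]["name"] = name
    config["metrics"]["name"] = name
    # Exactly one service per service-runner instance, as discussed above.
    config["services"] = [
        {"name": name, "module": entry_point, "conf": copy.deepcopy(service_conf)}
    ]
    return config


graphoid = compile_config("graphoid", "src/app.js", {"port": 19000})
print(graphoid["services"][0]["conf"])
```

The single-element `services` list reflects the one limitation mentioned: the generated file never describes more than one service.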

One option we could consider is an include syntax for per-service configs, which would keep your separate config file working, while also avoiding a proliferation of custom config loading.

Right. To make it more general we could include a flag in service::node, something like $is_full_config which, if set to true, does not do the usual merging, but simply reads out the user-supplied configuration as a whole.
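A small sketch of what such a flag could mean, written in Python rather than Puppet purely for illustration. The function `build_config` and its arguments are hypothetical; only the flag name follows the `$is_full_config` idea above.

```python
import copy


def build_config(common, user_config, is_full_config=False):
    """Return the final config for one service.

    With is_full_config set, the user-supplied config is used as a whole
    and the shared bits are ignored. Otherwise the usual merging applies:
    shared bits plus the user config under services[0].conf.
    """
    if is_full_config:
        return copy.deepcopy(user_config)
    merged = copy.deepcopy(common)
    merged["services"] = [{"conf": copy.deepcopy(user_config)}]
    return merged


common = {"logging": {"level": "warn"}}
print(build_config(common, {"port": 1234}))
print(build_config(common, {"services": []}, is_full_config=True))
```

Note that the full-config path silently drops the shared logging and metrics settings, so a service using it would have to supply its own.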

@mobrovac I don't see a reason to allow our users to shoot themselves in the foot, but maybe you and @GWicke see a use-case I don't see.

I mean - if someone wants to create a completely custom service installation, the right way to go is to create an independent module and not to use service::node.

In T97615#1296041, @Joe wrote:

@mobrovac I don't see a reason to allow our users to shoot themselves in the foot, but maybe you and @GWicke see a use-case I don't see.

I was thinking here about being able to make RESTBase use service::node. There are two things which currently do not make that possible:

  1. service::node uses Upstart scripts only. This can be fixed by making use of base::service_unit and pick_init_script and supplying a systemd script. This is a step we should take either way because of T96017.
  2. One thing that is common to all SCA services is that their entry point is src/app.js, while RESTBase's is restbase/lib/server.js. This is the reason why I'd be in favour of users being able to provide their configs. But now thinking about it, we could either:
    • symlink src/app.js to restbase/lib/server.js - uglier but easy
    • allow the user to provide only the entry point separately to service::node, something like $entry_point

In T97615#1296042, @Joe wrote:

I mean - if someone wants to create a completely custom service installation, the right way to go is to create an independent module and not to use service::node.

Good point :) I concur.