
Cleanup refinery artifacts folder from unneeded jars
Closed, Resolved · Public · 5 Estimated Story Points

Description

Cleanup refinery artifacts folder from unneeded jars

For both the deployment repo and the Hadoop cluster. Right now we have about 8G of jars, of which we likely need only half.

Event Timeline

Audit of versioned jar files needed (non-versioned files should always be there):

  • In puppet repo:
    jar                       | Defined in
    camus-wmf-0.1.0-wmf9.jar  | modules/profile/manifests/analytics/refinery/job/camus.pp and test/camus.pp
    refinery-camus-0.0.90.jar | modules/profile/manifests/analytics/refinery/job/camus.pp and test/camus.pp
    refinery-job-0.0.83.jar   | modules/profile/manifests/analytics/refinery/job/druid_load.pp
    refinery-job-0.0.94.jar   | modules/profile/manifests/analytics/refinery/job/refine.pp
    refinery-job-0.0.97.jar   | modules/profile/manifests/analytics/refinery/job/test/refine.pp
  • In refinery repo:
    jar                           | Defined in
    refinery-job-0.0.61.jar       | oozie/apis/coordinator.properties
    refinery-cassandra-0.0.35.jar | oozie/cassandra/bundle.properties and associated coordinators and historical workflow
    refinery-hive-0.0.53.jar      | oozie/cassandra/bundle.properties and associated coordinators
    refinery-job-0.0.89.jar       | oozie/clickstream/coordinator.properties
    refinery-hive-0.0.93.jar      | spread between oozie/data_quality/coordinator.properties (version) and eventcapsule_metrics.hql (file)
    refinery-hive-0.0.53.jar      | spread between oozie/interlanguage/daily/coordinator.properties (version) and interlanguage_navigation.hql (file)
    refinery-hive-0.0.41.jar      | oozie/mediacounts/load/insert_hourly_mediacounts.hql
    refinery-hive-0.0.98.jar      | spread between oozie/mediarequest/hourly/coordinator.properties (version) and mediarequest_hourly.hql (file)
    refinery-hive-0.0.53.jar      | spread between oozie/mediawiki/geoeditors/monthly/coordinator.properties (version) and insert_geoeditors_daily_data.hql (file)
    refinery-job-0.0.93.jar       | oozie/mediawiki/history/check_denormalize/coordinator.properties
    refinery-job-0.0.93.jar       | oozie/mediawiki/history/denormalize/coordinator.properties
    refinery-job-0.0.88.jar       | oozie/mediawiki/history/reduced/coordinator.properties
    refinery-job-0.0.85.jar       | oozie/mediawiki/history/wikitext/coordinator.properties
    refinery-job-0.0.61.jar       | oozie/mobile_apps/session_metrics/coordinator.properties
    refinery-hive-0.0.24.jar      | spread between oozie/projectview/geo/coordinator.properties (version) and archive_projectview_geo_hourly.hql (file)
    refinery-hive-0.0.46.jar      | oozie/unique_devices/per_project_family/daily/coordinator.properties
    refinery-hive-0.0.46.jar      | oozie/unique_devices/per_project_family/monthly/coordinator.properties
    refinery-hive-0.0.58.jar      | spread between oozie/virtualpageview/hourly/coordinator.properties (version) and virtualpageview_hourly.hql (file)
    refinery-hive-0.0.94.jar      | spread between oozie/webrequest/load/bundle.properties (version) and refine_webrequest.hql (file)
    refinery-job-0.0.80.jar       | oozie/webrequest/subset/coordinator.properties
    refinery-job-0.0.62.jar       | oozie/wikidata/articleplaceholder_metrics/coordinator.properties
    refinery-job-0.0.89.jar       | oozie/wikidata/coeditors_metrics/coordinator.properties
    refinery-job-0.0.61.jar       | oozie/wikidata/specialentitydata_metrics/coordinator.properties
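
An audit like the one above could be partly automated. The following is a minimal sketch, not the method actually used for this task: it walks a checked-out repo and records every file that mentions a fully spelled-out versioned jar name. The function name `find_jar_references` and the filename pattern are assumptions made for illustration. It would not catch the "spread between" cases listed above, where the version lives in a properties file and the jar name is assembled elsewhere, so those still need a manual check.

```python
import os
import re
from collections import defaultdict

# Assumed naming pattern: <name>-<major>.<minor>.<patch>[-suffix].jar,
# e.g. refinery-job-0.0.83.jar or camus-wmf-0.1.0-wmf9.jar.
JAR_PATTERN = re.compile(r"[A-Za-z][\w-]*?-\d+\.\d+\.\d+(?:-\w+)?\.jar")


def find_jar_references(root):
    """Map each versioned jar name found under `root` to the files mentioning it."""
    refs = defaultdict(set)
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git metadata; pruning dirnames in place stops os.walk descending.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read()
            except OSError:
                continue  # unreadable file: ignore it for the audit
            for jar in JAR_PATTERN.findall(text):
                refs[jar].add(os.path.relpath(path, root))
    return refs
```

Running it once per repo (puppet and refinery) and merging the key sets would give a candidate list of jars still in use, to be reviewed by hand.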

Question for @Ottomata and @Nuria: Do we prefer to migrate jobs from old jars to newer ones and get rid of every jar older than version X, or do we get rid of currently unused jars only (which already represents a huge win)?

I think removing unused jars would work for now, right?

Ya let's just remove all currently unused.
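
As a rough illustration of "remove all currently unused", here is a sketch of how the removal candidates could be computed from a listing of the artifacts folder and the set of referenced jars. It only lists candidates and deletes nothing. The function name `unused_versioned_jars` and the version pattern are assumptions, and non-versioned files are always kept, as the audit notes.

```python
import re

# Assumed: a "versioned" jar ends in -<major>.<minor>.<patch>[-suffix].jar
VERSIONED = re.compile(r"-\d+\.\d+\.\d+(?:-\w+)?\.jar$")


def unused_versioned_jars(present, referenced):
    """Return the versioned jars in `present` that `referenced` does not name."""
    keep = set(referenced)
    return sorted(
        jar for jar in present
        if VERSIONED.search(jar) and jar not in keep
    )


# Hypothetical folder listing: the 0.0.84 and 0.0.54 jars and the
# non-versioned refinery-job.jar are made up for this example.
present = [
    "refinery-job.jar",
    "refinery-job-0.0.83.jar",
    "refinery-job-0.0.84.jar",
    "refinery-hive-0.0.53.jar",
    "refinery-hive-0.0.54.jar",
]
referenced = {"refinery-job-0.0.83.jar", "refinery-hive-0.0.53.jar"}
print(unused_versioned_jars(present, referenced))
# → ['refinery-hive-0.0.54.jar', 'refinery-job-0.0.84.jar']
```

Keeping this as a dry run leaves room to review the list before anything is removed from the repo or the cluster.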

Change 534611 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Cleanup artifacts folder

https://gerrit.wikimedia.org/r/534611

JAllemandou set the point value for this task to 5.

Change 534611 merged by Nuria:
[analytics/refinery@master] Cleanup artifacts folder

https://gerrit.wikimedia.org/r/534611