By default, Sqoop queries the database it's connected to in order to generate ORM Java classes before transferring data. It also accepts a parameter to reuse a pre-generated jar instead. If we pass this, our sqoop jobs will run faster because the code-generation step is skipped.
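For illustration, the two-step flow Sqoop supports looks roughly like this (connection string, table name, and paths here are made up; the flags themselves — codegen, --bindir, --jar-file, --class-name — are standard Sqoop options):

```
# Step 1 (once): generate the ORM bindings and package them into a jar.
sqoop codegen \
  --connect jdbc:mysql://dbhost/enwiki \
  --table revision \
  --outdir /tmp/sqoop-gen \
  --bindir /srv/sqoop-jars

# Step 2 (every run): import using the pre-generated jar,
# skipping the per-table column-detection / codegen step.
sqoop import \
  --connect jdbc:mysql://dbhost/enwiki \
  --table revision \
  --jar-file /srv/sqoop-jars/revision.jar \
  --class-name revision \
  --target-dir /wmf/data/raw/revision
```

Note that --jar-file requires --class-name, since Sqoop needs to know which class inside the jar maps to the table.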
Event Timeline
Oh yeah, it's relevant even if we only run sqoop once, because it repeats the column-detection process for every table in every database (so roughly 5000 times per run). I probably shouldn't have skipped this in the first place, but I was afraid I'd find different schemas on different dbs (which is actually the case), so I didn't want to hold up the sqooping task any longer.
- we need a jar with bindings for the MySQL schema. How do we generate this?
- we could run the script with a --generate-jar parameter, or do it automatically when we deploy refinery; the jars would need to live somewhere the script can find them at runtime. The script needs to run on 1002 to be able to generate bindings from MySQL
- change the sqoop job to take a parameter that passes the jar along
- pass the jar to the sqoop job so it no longer has to generate ORM code itself
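The steps above might look like this in practice (the script name and paths are assumptions for illustration; the --generate-jar and --jar-file option names match the refinery patch that was merged for this task):

```
# On a host with MySQL access (e.g. 1002): generate the bindings jar once,
# writing it to a location the sqoop job can read at runtime.
sqoop-mediawiki-tables --generate-jar \
  --jar-file /srv/refinery/mediawiki-tables-sqoop-orm.jar

# On each scheduled run: reuse the pre-generated jar instead of
# re-detecting columns for every table in every database.
sqoop-mediawiki-tables \
  --jar-file /srv/refinery/mediawiki-tables-sqoop-orm.jar
```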
Ping @Milimetric: this is lower priority than our design work, but if you feel you need to grab an item you could take this one.
Change 349723 had a related patch set uploaded (by Milimetric):
[analytics/refinery@master] [WIP] Add just-generate-jar and jar-file options
Change 349723 merged by Ottomata:
[analytics/refinery@master] Add --generate-jar and --jar-file options
Change 351667 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery@master] Add README.mediawiki-tables-sqoop-orm
Change 351857 had a related patch set uploaded (by Milimetric; owner: Milimetric):
[operations/puppet@production] Sqoop using the pre-generated orm jar
Change 351857 merged by Elukey:
[operations/puppet@production] Sqoop using the pre-generated orm jar
Change 351667 merged by Ottomata:
[analytics/refinery@master] Add README.mediawiki-tables-sqoop-orm