Swiftrepl was running via repl_all.sh on ms-fe1005 and had been stuck in an infinite loop for days, producing a large log file (>1GB) in the process.
The cause seems to be two objects with mismatched ETags, one in each of two containers: wikipedia-commons-local-thumb.56 and wikipedia-commons-local-thumb.03 (see the related logs below). The script kept looping over those two containers, always stopping at the same point, which prevented it from moving on to the other containers.
STATS: wikipedia-commons-local-thumb.56 processed: 61000/3413461 (1%), gets: 0, hit rate: 100%
wikipedia-commons-local-thumb.56 5/56/19670630_35_LI_211_New_York_(11974786273).jpg/180px-19670630_35_LI_211_New_York_(11974786273).jpg E-Tag mismatch: c7411afe7ef90b2aae5fe1cab5b69328/519012036565348ab6c07cf7756362fd, syncing
transferred 6360 out of 9900 for 5/56/19670630_35_LI_211_New_York_(11974786273).jpg/180px-19670630_35_LI_211_New_York_(11974786273).jpg
transferred 6360 out of 9900 for 5/56/19670630_35_LI_211_New_York_(11974786273).jpg/180px-19670630_35_LI_211_New_York_(11974786273).jpg
Repeated error in replicate_object
Traceback (most recent call last):
  File "./swiftrepl.py", line 473, in replicator_thread
    sync_container(container, kwargs['srcconnpool'], kwargs['dstconnpool'])
  File "./swiftrepl.py", line 332, in sync_container
    replicate_object(srcobj, dstobj, srcconnpool, dstconnpool)
  File "./swiftrepl.py", line 185, in replicate_object
    send_object(dstobj, object_stream(response, chunksize=65536), headers)
  File "./swiftrepl.py", line 137, in send_object
    raise cloudfiles.errors.IncompleteSend()
IncompleteSend
Abandoning container wikipedia-commons-local-thumb.56 for now

STATS: wikipedia-commons-local-thumb.03 processed: 104000/3431616 (3%), gets: 0, hit rate: 99%
wikipedia-commons-local-thumb.03 0/03/2014.06.18_maz-54329.JPG/1920px-2014.06.18_maz-54329.JPG E-Tag mismatch: 70c5b8f3b38f277f80f50a4aadc7d66b/0080a2f0746c8da110a7504a49004b2a, syncing
transferred 259747 out of 259758 for 0/03/2014.06.18_maz-54329.JPG/1920px-2014.06.18_maz-54329.JPG
transferred 259747 out of 259758 for 0/03/2014.06.18_maz-54329.JPG/1920px-2014.06.18_maz-54329.JPG
Repeated error in replicate_object
Traceback (most recent call last):
  File "./swiftrepl.py", line 473, in replicator_thread
    sync_container(container, kwargs['srcconnpool'], kwargs['dstconnpool'])
  File "./swiftrepl.py", line 332, in sync_container
    replicate_object(srcobj, dstobj, srcconnpool, dstconnpool)
  File "./swiftrepl.py", line 185, in replicate_object
    send_object(dstobj, object_stream(response, chunksize=65536), headers)
  File "./swiftrepl.py", line 137, in send_object
    raise cloudfiles.errors.IncompleteSend()
IncompleteSend
Abandoning container wikipedia-commons-local-thumb.03 for now
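In both logs a single object that cannot be sent in full causes the whole container to be abandoned, and the next pass hits the same object again. One possible way to avoid this is sketched below: give each object a bounded number of attempts, then record it as failed and carry on with the rest of the container. This is only an illustration and not the actual swiftrepl.py code. The names sync_objects, replicate and max_attempts are hypothetical, and IncompleteSend here is a local stand-in for cloudfiles.errors.IncompleteSend.

```python
class IncompleteSend(Exception):
    """Local stand-in for cloudfiles.errors.IncompleteSend (assumption)."""


def sync_objects(names, replicate, max_attempts=2):
    """Replicate every object in `names`, skipping those that keep failing.

    `replicate` is a hypothetical callable taking an object name; it is
    expected to raise IncompleteSend when fewer bytes than announced were
    transferred. Returns a (done, failed) pair of lists of object names.
    """
    done, failed = [], []
    for name in names:
        for _attempt in range(max_attempts):
            try:
                replicate(name)
            except IncompleteSend:
                # Try the same object again until the attempts run out.
                continue
            done.append(name)
            break
        else:
            # All attempts failed: remember the object, but do not
            # abandon the remaining objects in the container.
            failed.append(name)
    return done, failed
```

With this approach the two problematic thumbnails would end up in the failed list for later inspection, while the remaining objects of the container would still be processed.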
To deploy the discovery URLs to swift-proxy I had to stop swiftrepl so that swift-proxy could be restarted on this host. I have since restarted swiftrepl and will monitor it over the next hours/days to see whether the behaviour is the same or whether it is able to get past those two files.