current situation:
* codfw: swift 1.13.1/2.2.0 on trusty/jessie
* eqiad: swift 1.13.1/2.2.0 on trusty/jessie
* esams: swift 2.2.0 on jessie (cluster not used)
swift changelog: https://github.com/openstack/swift/blob/master/CHANGELOG
jessie ships with swift 2.2 (juno), though the latest version in backports is 2.7 (mitaka)
```lines=10,name=changelog diff 2.2.0 -> 2.7.0
swift (2.7.0, OpenStack Mitaka)
* Bump PyECLib requirement to >= 1.2.0
* Update container on fast-POST
"Fast-POST" is the mode where `object_post_as_copy` is set to
`False` in the proxy server config. This mode now allows for
fast, efficient updates of metadata without needing to fully
recopy the contents of the object. While the default still is
`object_post_as_copy` as True, the plan is to change the default
to False and then deprecate post-as-copy functionality in later
releases. Fast-POST now supports container-sync functionality.
* Add concurrent reads option to proxy.
This change adds 2 new parameters to enable and control concurrent
GETs in Swift, these are `concurrent_gets` and `concurrency_timeout`.
`concurrent_gets` allows you to turn on or off concurrent
GETs; when on, it will set the GET/HEAD concurrency to the
replica count. And in the case of EC HEADs it will set it to
ndata. The proxy will then serve only the first valid source to
respond. This applies to all account, container, and replicated
object GETs and HEADs. For EC only HEAD requests are affected.
The default for `concurrent_gets` is off.
`concurrency_timeout` is related to `concurrent_gets` and is
the amount of time to wait before firing the next thread. A
value of 0 will fire at the same time (fully concurrent), but
setting another value will stagger the firing allowing you the
ability to give a node a short chance to respond before firing
the next. This value is a float and should be somewhere between
0 and `node_timeout`. The default is `conn_timeout`, meaning by
default it will stagger the firing.
* Added an operational procedures guide to the docs. It can be
found at http://swift.openstack.org/ops_runbook/index.html and
includes information on detecting and handling day-to-day
operational issues in a Swift cluster.
* Make `handoffs_first` a more useful mode for the object replicator.
The `handoffs_first` replication mode is used during periods of
problematic cluster behavior (e.g. full disks) when replication
needs to quickly drain partitions from a handoff node and move
them to a primary node.
Previously, `handoffs_first` would sort that handoff work before
"normal" replication jobs, but the normal replication work could
take quite some time and result in handoffs not being drained
quickly enough.
In order to focus on getting handoff partitions off the node
`handoffs_first` mode will now abort the current replication
sweep before attempting any primary suffix syncing if any of the
handoff partitions were not removed for any reason - and start
over with replication of handoffs jobs as the highest priority.
Note that `handoffs_first` being enabled will emit a warning on
start up, even if no handoff jobs fail, because of the negative
impact it can have during normal operations by dog-piling on a
node that was temporarily unavailable.
* By default, inbound `X-Timestamp` headers are now disallowed
(except when in an authorized container-sync request). This
header is useful for allowing data migration from other storage
systems to Swift and keeping the original timestamp of the data.
If you have this migration use case (or any other requirement on
allowing the clients to set an object's timestamp), set the
`shunt_inbound_x_timestamp` config variable to False in the
gatekeeper middleware config section of the proxy server config.
* Requesting a SLO manifest file with the query parameters
"?multipart-manifest=get&format=raw" will return the contents of
the manifest in the format as was originally sent by the client.
The "format=raw" is new.
* Static web page listings can now be rendered with a custom
label. By default listings are rendered with a label of:
"Listing of /v1/<account>/<container>/<path>". This change adds
a new custom metadata key/value pair
`X-Container-Meta-Web-Listings-Label: My Label` that when set,
will cause the following: "Listing of My Label/<path>" to be
rendered instead.
* Previously, static large objects (SLOs) had a minimum segment
size (default to 1MiB). This limit has been removed, but small
segments will be ratelimited. The config parameter
`rate_limit_under_size` controls the definition of "small"
segments (1MiB by default), and `rate_limit_segments_per_sec`
controls how many segments per second can be served (default is 1).
With the default values, the effective behavior is identical to the
previous behavior when serving SLOs.
* Container sync has been improved to perform a HEAD on the remote
side of the sync for each object being synced. If the object
exists on the remote side, container-sync will no longer
transfer the object, thus significantly lowering the network
requirements to use the feature.
* The object auditor will now clean up any old, stale rsync temp
files that it finds. These rsync temp files are left if the
rsync process fails without completing a full transfer of an
object. Since these files can be large, the temp files may end
up filling a disk. The new auditor functionality will reap these
rsync temp files if they are old. The new object-auditor config
variable `rsync_tempfile_timeout` is the number of seconds old a
tempfile must be before it is reaped. By default, this variable
is set to "auto" or the rsync_timeout plus 900 seconds (falling
back to a value of 1 day).
* The Erasure Code reconstruction process has been made more
efficient by not syncing data files when only the durable commit
file is missing.
* Fixed a bug where 304 and 416 response may not have the right
Etag and Accept-Ranges headers when the object is stored in an
Erasure Coded policy.
* Versioned writes now correctly stores the date of previous versions
using GMT instead of local time.
* The deprecated Keystone middleware option is_admin has been removed.
* Fixed log format in object auditor.
* The zero-byte mode (ZBF) of the object auditor will now properly
observe the `--once` option.
* Swift keeps track, internally, of "dirty" parts of the partition
keyspace with a "hashes.pkl" file. Operations on this file no
longer require a read-modify-write cycle and use a new
"hashes.invalid" file to track dirty partitions. This change
will improve end-user performance for PUT and DELETE operations.
* The object replicator's succeeded and failed counts are now logged.
* `swift-recon` can now query hosts by storage policy.
* The log_statsd_host value can now be an IPv6 address or a hostname
which only resolves to an IPv6 address.
* Erasure coded fragments now properly call fallocate to reserve disk
space before being written.
* Various other minor bug fixes and improvements.
swift (2.6.0)
* Dependency changes
- Updated minimum version of eventlet to 0.17.4 to support IPv6.
- Updated the minimum version of PyECLib to 1.0.7.
* The ring rebalancing algorithm was updated to better handle edge cases
and to give better (more balanced) rings in the general case. New rings
will have better initial placement, capacity adjustments will move less
data for better balance, and existing rings that were imbalanced should
start to become better balanced as they go through rebalance cycles.
* Added container and account reverse listings.
A GET request to an account or container resource with a "reverse=true"
query parameter will return the listing in reverse order. When
iterating over pages of reverse listings, the relative order of marker
and end_marker are swapped.
* Storage policies now support having more than one name.
This allows operators to fix a typo without breaking existing clients,
or, alternatively, have "short names" for policies. This is implemented
with the "aliases" config key in the storage policy config in
swift.conf. The aliases value is a list of names that the storage
policy may also be identified by. The storage policy "name" is used to
report the policy to users (eg in container headers). The aliases have
the same naming restrictions as the policy's primary name.
* The object auditor learned the "interval" config value to control the
time between each audit pass.
* `swift-recon --all` now includes the config checksum check.
* `swift-init` learned the --kill-after-timeout option to force a service
to quit (SIGKILL) after a designated time.
* `swift-recon` now correctly shows timestamps in UTC instead of local
time.
* Fixed bug where `swift-ring-builder` couldn't select device id 0.
* Documented the previously undocumented
`swift-ring-builder pretend_min_part_hours_passed` command.
* The "node_timeout" config value now accepts decimal values.
* `swift-ring-builder` now properly removes devices with zero weight.
* `swift-init` return codes are updated via "--strict" and "--non-strict"
options. Please see the usage string for more information.
* `swift-ring-builder` now reports the min_part_hours lockout time
remaining
* Container sync has been improved to more quickly find and iterate over
the containers to be synced. This reduced server load and lowers the
time required to see data propagate between two clusters. Please see
http://swift.openstack.org/overview_container_sync.html for more details
about the new on-disk structure for tracking synchronized containers.
* A container POST will now update that container's put-timestamp value.
* TempURL header restrictions are now exposed in /info.
* Error messages on static large object manifest responses have been
greatly improved.
* Closed a bug where an unfinished read of a large object would leak a
socket file descriptor and a small amount of memory. (CVE-2016-0738)
* Fixed an issue where a zero-byte object PUT with an incorrect Etag
would return a 503.
* Fixed an error when a static large object manifest references the same
object more than once.
* Improved performance of finding handoff nodes if a zone is empty.
* Fixed duplication of headers in Access-Control-Expose-Headers on CORS
requests.
* Fixed handling of IPv6 connections to memcache pools.
* Continued work towards python 3 compatibility.
* Various other minor bug fixes and improvements.
swift (2.5.0, OpenStack Liberty)
* Added the ability to specify ranges for Static Large Object (SLO)
segments.
* Replicator configs now support an "rsync_module" value to allow
for per-device rsync modules. This setting gives operators the
ability to fine-tune replication traffic in a Swift cluster and
isolate replication disk IO to a particular device. Please see
the docs and sample config files for more information and
examples.
* Significant work has gone in to testing, fixing, and validating
Swift's erasure code support at different scales.
* Swift now emits StatsD metrics on a per-policy basis.
* Fixed an issue with Keystone integration where a COPY request to a
service account may have succeeded even if a service token was not
included in the request.
* Ring validation now warns if a placement partition gets assigned to the
same device multiple times. This happens when devices in the ring are
unbalanced (e.g. two servers where one server has significantly more
available capacity).
* Various other minor bug fixes and improvements.
swift (2.4.0)
* Dependency changes
- Added six requirement. This is part of an ongoing effort to add
support for Python 3.
- Dropped support for Python 2.6.
* Config changes
- Recent versions of Python restrict the number of headers allowed in a
request to 100. This number may be too low for custom middleware. The
new "extra_header_count" config value in swift.conf can be used to
increase the number of headers allowed.
- Renamed "run_pause" setting to "interval" (current configs with
run_pause still work). Future versions of Swift may remove the
run_pause setting.
* Versioned writes middleware
The versioned writes feature has been refactored and reimplemented as
middleware. You should explicitly add the versioned_writes middleware to
your proxy pipeline, but do not remove or disable the existing container
server config setting ("allow_versions"), if it is currently enabled.
The existing container server config setting enables existing
containers to continue being versioned. Please see
http://swift.openstack.org/middleware.html#how-to-enable-object-versioning-in-a-swift-cluster
for further upgrade notes.
* Allow 1+ object-servers-per-disk deployment
Enabled by a new > 0 integer config value, "servers_per_port" in the
[DEFAULT] config section for object-server and/or replication server
configs. The setting's integer value determines how many different
object-server workers handle requests for any single unique local port
in the ring. In this mode, the parent swift-object-server process
continues to run as the original user (i.e. root if low-port binding
is required), binds to all ports as defined in the ring, and forks off
the specified number of workers per listen socket. The child, per-port
servers drop privileges and behave pretty much how object-server workers
always have, except that because the ring has unique ports per disk, the
object-servers will only be handling requests for a single disk. The
parent process detects dead servers and restarts them (with the correct
listen socket), starts missing servers when an updated ring file is
found with a device on the server with a new port, and kills extraneous
servers when their port is found to no longer be in the ring. The ring
files are stat'ed at most every "ring_check_interval" seconds, as
configured in the object-server config (same default of 15s).
In testing, this deployment configuration (with a value of 3) lowers
request latency, improves requests per second, and isolates slow disk
IO as compared to the existing "workers" setting. To use this, each
device must be added to the ring using a different port.
* Do container listing updates in another (green)thread
The object server has learned the "container_update_timeout" setting
(with a default of 1 second). This value is the number of seconds that
the object server will wait for the container server to update the
listing before returning the status of the object PUT operation.
Previously, the object server would wait up to 3 seconds for the
container server response. The new behavior dramatically lowers object
PUT latency when container servers in the cluster are busy (e.g. when
the container is very large). Setting the value too low may result in a
client PUT'ing an object and not being able to immediately find it in
listings. Setting it too high will increase latency for clients when
container servers are busy.
* TempURL fixes (closes CVE-2015-5223)
Do not allow PUT tempurls to create pointers to other data.
Specifically, disallow the creation of DLO object manifests via a PUT
tempurl. This prevents discoverability attacks which can use any PUT
tempurl to probe for private data by creating a DLO object manifest and
then using the PUT tempurl to head the object.
* Ring changes
- Partition placement no longer uses the port number to place
partitions. This improves dispersion in small clusters running one
object server per drive, and it does not affect dispersion in
clusters running one object server per server.
- Added ring-builder-analyzer tool to more easily test and analyze a
series of ring management operations.
- Stop moving partitions unnecessarily when overload is on.
* Significant improvements and bug fixes have been made to erasure code
support. This feature is suitable for beta testing, but it is not yet
ready for broad production usage.
* Bulk upload now treats user xattrs on files in the given archive as
object metadata on the resulting created objects.
* Emit warning log in object replicator if "handoffs_first" or
"handoff_delete" is set.
* Enable object replicator's failure count in swift-recon.
* Added storage policy support to dispersion tools.
* Support keystone v3 domains in swift-dispersion.
* Added domain_remap information to the /info endpoint.
* Added support for a "default_reseller_prefix" in domain_remap
middleware config.
* Allow SLO PUTs to forgo per-segment integrity checks. Previously, each
segment referenced in the manifest also needed the correct etag and
bytes setting. These fields now allow the "null" value to skip those
particular checks on the given segment.
* Allow rsync to use compression via a "rsync_compress" config. If set to
true, compression is only enabled for an rsync to a device in a
different region. In some cases, this can speed up cross-region
replication data transfer.
* Added time synchronization check in swift-recon (the --time option).
* The account reaper now runs faster on large accounts.
* Various other minor bug fixes and improvements.
swift (2.3.0, OpenStack Kilo)
* Erasure Code support (beta)
Swift now supports an erasure-code (EC) storage policy type. This allows
deployers to achieve very high durability with less raw capacity as used
in replicated storage. However, EC requires more CPU and network
resources, so it is not good for every use case. EC is great for storing
large, infrequently accessed data in a single region.
Swift's implementation of erasure codes is meant to be transparent to
end users. There is no API difference between replicated storage and
EC storage.
To support erasure codes, Swift now depends on PyECLib and
liberasurecode. liberasurecode is a pluggable library that allows for
the actual EC algorithm to be implemented in a library of your choosing.
As a beta release, EC support is nearly fully feature complete, but it
is lacking support for some features (like multi-range reads) and has
not had a full performance characterization. This feature relies on
ssync for durability. Deployers are urged to do extensive testing and
not deploy production data using an erasure code storage policy.
Full docs are at http://swift.openstack.org/overview_erasure_code.html
* Add support for container TempURL Keys.
* Make more memcache options configurable. connection_timeout,
pool_timeout, tries, and io_timeout are all now configurable.
* Swift now supports composite tokens. This allows another service to
act on behalf of a user, but only with that user's consent.
See http://swift.openstack.org/overview_auth.html for more details.
* Multi-region replication was improved. When replicating data to a
different region, only one replica will be pushed per replication
cycle. This gives the remote region a chance to replicate the data
locally instead of pushing more data over the inter-region network.
* Internal requests from the ratelimit middleware now properly log a
swift_source. See http://swift.openstack.org/logs.html for details.
* Improved storage policy support for quarantine stats in swift-recon.
* The proxy log line now includes the request's storage policy index.
* Ring checker has been added to swift-recon to validate if rings are
built correctly. As part of this feature, storage servers have learned
the OPTIONS verb.
* Add support of x-remove- headers for container-sync.
* Rings now support hostnames instead of just IP addresses.
* Swift now enforces that the API version on a request is valid. Valid
versions are configured via the valid_api_versions setting in swift.conf
* Various other minor bug fixes and improvements.
swift (2.2.2)
* Data placement changes
This release has several major changes to data placement in Swift in
order to better handle different deployment patterns. First, with an
unbalance-able ring, less partitions will move if the movement doesn't
result in any better dispersion across failure domains. Also, empty
(partition weight of zero) devices will no longer keep partitions after
rebalancing when there is an unbalance-able ring.
Second, the notion of "overload" has been added to Swift's rings. This
allows devices to take some extra partitions (more than would normally
be allowed by the device weight) so that smaller and unbalanced clusters
will have less data movement between servers, zones, or regions if there
is a failure in the cluster.
Finally, rings have a new metric called "dispersion". This is the
percentage of partitions in the ring that have too many replicas in a
particular failure domain. For example, if you have three servers in a
cluster but two replicas for a partition get placed onto the same
server, that partition will count towards the dispersion metric. A
lower value is better, and the value can be used to find the proper
value for "overload".
The overload and dispersion metrics have been exposed in the
swift-ring-build CLI tools.
See http://docs.openstack.org/developer/swift/overview_ring.html
for more info on how data placement works now.
* Improve replication of large out-of-sync, out-of-date containers.
* Added console logging to swift-drive-audit with a new log_to_console
config option (default False).
* Optimize replication when a device and/or partition is specified.
* Fix dynamic large object manifests getting versioned. This was not
intended and did not work. Now it is properly prevented.
* Fix the GET's response code when there is a missing segment in a
large object manifest.
* Change black/white listing in ratelimit middleware to use sysmeta.
Instead of using the config option, operators can set
"X-Account-Sysmeta-Global-Write-Ratelimit: WHITELIST" or
"X-Account-Sysmeta-Global-Write-Ratelimit: BLACKLIST" on an account to
whitelist or blacklist it for ratelimiting. Note: the existing
config options continue to work.
* Use TCP_NODELAY on outgoing connections.
* Improve object-replicator startup time.
* Implement OPTIONS verb for storage nodes.
* Various other minor bug fixes and improvements.
swift (2.2.1)
* Swift now rejects object names with Unicode surrogates.
* Return 403 (instead of 413) on unauthorized upload when over account
quota.
* Fix a rare condition when a rebalance could cause swift-ring-builder
to crash. This would only happen on old ring files when "rebalance"
was the first command run.
* Storage node error limits now survive a ring reload.
* Speed up reading and writing xattrs for object metadata by using larger
xattr value sizes. The change is moving from 254 byte values to 64KiB
values. There is no migration issue with this.
* Deleted containers beyond the reclaim age are now properly reclaimed.
* Full Simplified Chinese translation (zh_CN locale) for errors and logs.
* Container quota is now properly enforced during cross-account COPY.
* ssync replication now properly uses the configured replication_ip.
* Fixed issue where ssync did not replicate custom object headers.
* swift-drive-audit now has the 'unmount_failed_device' config option
(default to True) that controls if the process will unmount failed
drives or not.
* swift-drive-audit will now dump drive error rates to a recon file.
The file location is controlled by the 'recon_cache_path' config value
and it includes each drive and its associated number of errors.
* When a filesystem doesn't support xattr, the object server now returns
a 507 Insufficient Storage error to the proxy server.
* Clean up empty account and container partitions directories if they
are empty. This keeps the system healthy and prevents a large number
of empty directories from slowing down the replication process.
* Show the sum of every policy's amount of async pendings in swift-recon.
* Various other minor bug fixes and improvements.
```
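Several changelog entries above affect existing config files: 2.4.0 asks for `versioned_writes` in the proxy pipeline and renames `run_pause` to `interval`, and 2.7.0 removes the deprecated Keystone `is_admin` option. A rough pre-upgrade check could look like the sketch below. The function name `upgrade_warnings` and the warning texts are made up for illustration, and the sketch assumes the usual paste-deploy layout (a `[pipeline:main]` section with a `pipeline` key); it is not an official Swift tool.

```python
import configparser


def upgrade_warnings(conf_text):
    """Return warnings for one Swift config file, given its text.

    Hypothetical helper: checks only the three items named in the
    changelog (versioned_writes, run_pause, is_admin).
    """
    cfg = configparser.ConfigParser(interpolation=None, strict=False)
    cfg.read_string(conf_text)
    warnings = []

    # 2.4.0: versioned writes became middleware that should be listed
    # explicitly in the proxy pipeline. Only proxy configs are checked.
    if cfg.has_section("pipeline:main"):
        pipeline = cfg.get("pipeline:main", "pipeline", fallback="").split()
        if "proxy-server" in pipeline and "versioned_writes" not in pipeline:
            warnings.append("proxy pipeline lacks versioned_writes")

    for section in cfg.sections():
        # 2.4.0: run_pause was renamed to interval (old name still works,
        # but may be removed in a future release).
        if cfg.has_option(section, "run_pause"):
            warnings.append("[%s] uses run_pause; rename to interval" % section)
        # 2.7.0: the deprecated Keystone middleware option is_admin is gone.
        if cfg.has_option(section, "is_admin"):
            warnings.append("[%s] sets is_admin, removed in 2.7.0" % section)

    return warnings
```

Calling `upgrade_warnings(open(path).read())` on each deployed config would give a quick list of things to review before moving from 2.2.0 to 2.7.0; the changelog itself remains the authoritative list.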
there are jessie backports provided at http://liberty-jessie.pkgs.mirantis.com/ we could use/import. Dependencies to backport are relatively self contained: `python-eclib`, `liberasurecode`, `python-eventlet`
Note that a bug introduced in 2.7 and marked as "critical" (https://bugs.launchpad.net/swift/+bug/1651530) is fixed in 2.10 at least, so we might hold off on 2.7.
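To make the "hold off on 2.7" reasoning concrete, here is a small sketch that checks whether a candidate version falls between the release that introduced the bug and the first release known here to contain the fix. The upper bound of 2.10.0 is an assumption taken from the note above (the fix may also exist in an earlier release), and the helper names are hypothetical.

```python
def parse_version(text):
    """Turn '2.10.0' into (2, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


BUG_INTRODUCED = parse_version("2.7.0")
# Assumption: 2.10.0 is used as a conservative upper bound; the note only
# says that 2.10 "at least" carries the fix.
ASSUMED_FIXED = parse_version("2.10.0")


def in_affected_range(version):
    """True if the version is at or after 2.7.0 and before the assumed fix."""
    return BUG_INTRODUCED <= parse_version(version) < ASSUMED_FIXED
```

Comparing tuples of integers rather than strings matters here, since a plain string comparison would sort "2.10.0" before "2.7.0".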