package main

import (
	"io"
	"sync/atomic"
	"time"
)

// A Volume is an interface representing a Keep back-end storage unit:
// for example, a single mounted disk, a RAID array, an Amazon S3 volume,
// etc.
type Volume interface {
	// Get a block. IFF the returned error is nil, the caller must
	// put the returned slice back into the buffer pool when it's
	// finished with it.
	//
	// loc is guaranteed to consist of 32 or more lowercase hex
	// digits.
	//
	// Get should not verify the integrity of the returned data:
	// it should just return whatever was found in its backing
	// store.
	//
	// If an error is encountered that prevents it from
	// retrieving the data, that error should be returned so the
	// caller can log (and send to the client) a more useful
	// message.
	//
	// If the error is "not found", and there's no particular
	// reason to expect the block to be found (other than that a
	// caller is asking for it), the returned error should satisfy
	// os.IsNotExist(err): this is a normal condition and will not
	// be logged as an error (except that a 404 will appear in the
	// access log if the block is not found on any other volumes
	// either).
	//
	// If the data in the backing store is bigger than BLOCKSIZE,
	// Get is permitted to return an error without reading any of
	// the data.
	Get(loc string) ([]byte, error)

	// Put writes a block to an underlying storage device.
	//
	// loc is as described in Get.
	//
	// len(block) is guaranteed to be between 0 and BLOCKSIZE.
	//
	// If a block is already stored under the same name (loc) with
	// different content, Put must either overwrite the existing
	// data with the new data or return a non-nil error.
	//
	// Put also sets the timestamp for the given locator to the
	// current time.
	//
	// Put must return a non-nil error unless it can guarantee
	// that the entire block has been written and flushed to
	// persistent storage, and that its timestamp is current. Of
	// course, this guarantee is only as good as the underlying
	// storage device, but it is Put's responsibility to at least
	// get whatever guarantee is offered by the storage device.
	//
	// Put should not verify that loc==hash(block): this is the
	// caller's responsibility.
	Put(loc string, block []byte) error

	// Touch sets the timestamp for the given locator to the
	// current time.
	//
	// loc is as described in Get.
	//
	// Touch must return a non-nil error unless it can guarantee
	// that a future call to Mtime() will return a timestamp newer
	// than {now minus one second}.
	Touch(loc string) error

	// Mtime returns the stored timestamp for the given locator.
	//
	// loc is as described in Get.
	//
	// Mtime must return a non-nil error if the given block is not
	// found or the timestamp could not be retrieved.
	Mtime(loc string) (time.Time, error)

	// IndexTo writes a complete list of locators with the given
	// prefix for which Get() can retrieve data.
	//
	// prefix consists of zero or more lowercase hexadecimal
	// digits.
	//
	// Each locator must be written to the given writer using the
	// following format:
	//
	//   loc "+" size " " timestamp "\n"
	//
	// where:
	//
	//   - size is the number of bytes of content, given as a
	//     decimal number with one or more digits
	//
	//   - timestamp is the timestamp stored for the locator,
	//     given as a decimal number of seconds after January 1,
	//     1970 UTC.
	//
	// IndexTo must not write any other data to writer: for
	// example, it must not write any blank lines.
	//
	// If an error makes it impossible to provide a complete
	// index, IndexTo must return a non-nil error. It is
	// acceptable to return a non-nil error after writing a
	// partial index to writer.
	//
	// The resulting index is not expected to be sorted in any
	// particular order.
	IndexTo(prefix string, writer io.Writer) error

	// Delete deletes the block data from the underlying storage
	// device.
	//
	// loc is as described in Get.
	//
	// If the timestamp for the given locator is less than
	// blob_signature_ttl seconds old, Delete must not delete the
	// data.
	//
	// If a Delete operation overlaps with any Touch or Put
	// operations on the same locator, the implementation must
	// ensure one of the following outcomes:
	//
	//   - Touch and Put return a non-nil error, or
	//   - Delete does not delete the block, or
	//   - Both of the above.
	//
	// If it is possible for the storage device to be accessed by
	// a different process or host, the synchronization mechanism
	// should also guard against races with other processes and
	// hosts. If such a mechanism is not available, there must be
	// a mechanism for detecting unsafe configurations, alerting
	// the operator, and aborting or falling back to a read-only
	// state. In other words, running multiple keepstore processes
	// with the same underlying storage device must either work
	// reliably or fail outright.
	//
	// Corollary: A successful Touch or Put guarantees a block
	// will not be deleted for at least blob_signature_ttl
	// seconds.
	Delete(loc string) error

	// Status returns a *VolumeStatus representing the current
	// in-use and available storage capacity and an
	// implementation-specific volume identifier (e.g., "mount
	// point" for a UnixVolume).
	Status() *VolumeStatus

	// String returns an identifying label for this volume,
	// suitable for including in log messages. It should contain
	// enough information to uniquely identify the underlying
	// storage device, but should not contain any credentials or
	// secrets.
	String() string

	// Writable returns false if all future Put, Mtime, and Delete
	// calls are expected to fail.
	//
	// If the volume is only temporarily unwritable -- or if Put
	// will fail because it is full, but Mtime or Delete can
	// succeed -- then Writable should return false.
	Writable() bool
}
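
// Illustrative sketch (not part of keepstore): the standalone program
// below shows one way parts of the contract documented above could be
// honored by a purely in-memory store. The names memBlock, memVolume,
// newMemVolume, and runDemo are hypothetical, ttl stands in for
// blob_signature_ttl, and returning nil from Delete when a block is too
// new is an assumption. Touch, Mtime, Status, String, and Writable are
// omitted for brevity.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
	"sync"
	"time"
)

// memBlock holds one stored block and its timestamp.
type memBlock struct {
	data  []byte
	mtime time.Time
}

// memVolume is a hypothetical in-memory store; ttl stands in for
// blob_signature_ttl.
type memVolume struct {
	mu     sync.Mutex
	blocks map[string]memBlock
	ttl    time.Duration
}

func newMemVolume(ttl time.Duration) *memVolume {
	return &memVolume{blocks: map[string]memBlock{}, ttl: ttl}
}

// Get returns a copy of the stored data. A missing block yields
// os.ErrNotExist, so os.IsNotExist(err) reports true.
func (v *memVolume) Get(loc string) ([]byte, error) {
	v.mu.Lock()
	defer v.mu.Unlock()
	b, ok := v.blocks[loc]
	if !ok {
		return nil, os.ErrNotExist
	}
	return append([]byte(nil), b.data...), nil
}

// Put overwrites any existing data and sets the timestamp to now.
func (v *memVolume) Put(loc string, block []byte) error {
	v.mu.Lock()
	defer v.mu.Unlock()
	v.blocks[loc] = memBlock{data: append([]byte(nil), block...), mtime: time.Now()}
	return nil
}

// IndexTo writes one `loc "+" size " " timestamp "\n"` line per block
// whose locator starts with prefix, and nothing else.
func (v *memVolume) IndexTo(prefix string, w io.Writer) error {
	v.mu.Lock()
	defer v.mu.Unlock()
	for loc, b := range v.blocks {
		if !strings.HasPrefix(loc, prefix) {
			continue
		}
		if _, err := fmt.Fprintf(w, "%s+%d %d\n", loc, len(b.data), b.mtime.Unix()); err != nil {
			return err
		}
	}
	return nil
}

// Delete removes a block only if its timestamp is at least ttl old.
// Holding mu for the whole call keeps it from interleaving with Put.
func (v *memVolume) Delete(loc string) error {
	v.mu.Lock()
	defer v.mu.Unlock()
	b, ok := v.blocks[loc]
	if !ok {
		return os.ErrNotExist
	}
	if time.Since(b.mtime) < v.ttl {
		return nil // assumption: a too-new block is silently kept
	}
	delete(v.blocks, loc)
	return nil
}

// runDemo exercises the methods and returns what it observed.
func runDemo() (data string, missing bool, index string, deleted bool) {
	loc := "acbd18db4cc2f85cedef654fccc4a4d8" // MD5 of "foo"
	v := newMemVolume(time.Hour)
	_ = v.Put(loc, []byte("foo"))
	_ = v.Delete(loc) // no effect: the block is newer than ttl
	got, _ := v.Get(loc)
	_, err := v.Get(strings.Repeat("0", 32))
	var sb strings.Builder
	_ = v.IndexTo("acbd", &sb)

	old := newMemVolume(0) // ttl of zero: every block is old enough
	_ = old.Put(loc, []byte("foo"))
	_ = old.Delete(loc)
	_, delErr := old.Get(loc)
	return string(got), os.IsNotExist(err), sb.String(), os.IsNotExist(delErr)
}

func main() {
	data, missing, index, deleted := runDemo()
	fmt.Printf("%q %v %q %v\n", data, missing, index, deleted)
}
```

// A single mutex is enough here only because the map is private to one
// process; a store shared between processes or hosts would need the
// stronger safeguards described under Delete.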

// A VolumeManager tells callers which volumes are readable, which
// volumes are writable, and on which volume the next write should be
// attempted.
type VolumeManager interface {
	// AllReadable returns all volumes.
	AllReadable() []Volume
	// AllWritable returns all volumes that aren't known to be in
	// a read-only state. (There is no guarantee that a write to
	// one will succeed, though.)
	AllWritable() []Volume
	// NextWritable returns the volume where the next new block
	// should be written. A VolumeManager can select a volume in
	// order to distribute activity across spindles, fill up disks
	// with more free space, etc.
	NextWritable() Volume
	// Close shuts down the volume manager cleanly.
	Close()
}
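
// RRVolumeManager is a VolumeManager that hands out writable volumes
// in round-robin order.
type RRVolumeManager struct {
	readables []Volume
	writables []Volume
	counter   uint32
}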
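
// MakeRRVolumeManager returns an RRVolumeManager in which every given
// volume is readable and only those reporting Writable() are writable.
func MakeRRVolumeManager(volumes []Volume) *RRVolumeManager {
	vm := &RRVolumeManager{}
	for _, v := range volumes {
		vm.readables = append(vm.readables, v)
		if v.Writable() {
			vm.writables = append(vm.writables, v)
		}
	}
	return vm
}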
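
// AllReadable returns all volumes.
func (vm *RRVolumeManager) AllReadable() []Volume {
	return vm.readables
}

// AllWritable returns the volumes that reported Writable() at
// construction time.
func (vm *RRVolumeManager) AllWritable() []Volume {
	return vm.writables
}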
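
// NextWritable returns the next writable volume in round-robin order,
// or nil if there are no writable volumes.
func (vm *RRVolumeManager) NextWritable() Volume {
	if len(vm.writables) == 0 {
		return nil
	}
	i := atomic.AddUint32(&vm.counter, 1)
	return vm.writables[i%uint32(len(vm.writables))]
}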

// Close is a no-op: RRVolumeManager holds no resources to release.
func (vm *RRVolumeManager) Close() {
}
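
// Illustrative sketch (not part of keepstore): the standalone program
// below isolates the counter-based selection used by NextWritable so it
// can be run on its own. stubVolume, rrPicker, newRRPicker, and
// pickNames are hypothetical names, and stubVolume models only the two
// methods needed for the demonstration rather than the full Volume
// interface.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// stubVolume is a hypothetical stand-in for a real Volume.
type stubVolume struct {
	name     string
	writable bool
}

func (v stubVolume) String() string { return v.name }
func (v stubVolume) Writable() bool { return v.writable }

// rrPicker mirrors the counter-based selection used by NextWritable.
type rrPicker struct {
	writables []stubVolume
	counter   uint32
}

// newRRPicker keeps only the volumes that report themselves writable.
func newRRPicker(volumes []stubVolume) *rrPicker {
	p := &rrPicker{}
	for _, v := range volumes {
		if v.Writable() {
			p.writables = append(p.writables, v)
		}
	}
	return p
}

// next returns the next writable volume, or false if there is none.
// The counter is incremented atomically, so concurrent callers each
// receive a distinct counter value.
func (p *rrPicker) next() (stubVolume, bool) {
	if len(p.writables) == 0 {
		return stubVolume{}, false
	}
	i := atomic.AddUint32(&p.counter, 1)
	return p.writables[i%uint32(len(p.writables))], true
}

// pickNames calls next n times and collects the chosen volume names.
func pickNames(p *rrPicker, n int) []string {
	var names []string
	for i := 0; i < n; i++ {
		if v, ok := p.next(); ok {
			names = append(names, v.String())
		}
	}
	return names
}

func main() {
	p := newRRPicker([]stubVolume{
		{"vol-a", true},
		{"vol-b", false}, // read-only, never selected
		{"vol-c", true},
	})
	fmt.Println(pickNames(p, 4)) // prints [vol-c vol-a vol-c vol-a]
}
```

// Because the counter is incremented before it is used, the first pick
// is index 1 rather than index 0; the rotation is still even across the
// writable volumes.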