Merge lp:~carlalex/duplicity/duplicity into lp:~duplicity-team/duplicity/0.8-series
Status: | Merged |
---|---|
Merged at revision: | 1516 |
Proposed branch: | lp:~carlalex/duplicity/duplicity |
Merge into: | lp:~duplicity-team/duplicity/0.8-series |
Diff against target: |
463 lines (+316/-3) 6 files modified
.bzrignore (+1/-0) bin/duplicity.1 (+101/-2) duplicity/backends/s3_boto3_backend.py (+205/-0) duplicity/commandline.py (+5/-1) duplicity/globals.py (+3/-0) requirements.txt (+1/-0) |
To merge this branch: | bzr merge lp:~carlalex/duplicity/duplicity |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
edso | Approve | ||
Review via email: mp+376206@code.launchpad.net |
Commit message
Boto3 backend for AWS.
Description of the change
Boto3 backend for AWS.
- 1525. By Carl A. Adams
-
merging from parent
Carl A. Adams (carlalex) wrote : | # |
I'll look into implementing boto3+s3. I have mixed feelings on it. On the plus side, I like that a URL should always behave one way. On the minus side, I think it should someday become the default for s3. But implementing a non-default boto3+s3 now doesn't preclude changing the defaults in the future. I expect boto will die completely some day.
> looks good! and man page is adapted as well. i like it.
>
> there's just one issue that i'd like to change. switching actual backend via
> --parameter is deprecated since we adopted the stacked scheme. you can check
> eg.
>
> ssh_paramiko_
> ssh_pexpect_
>
> for implementations where two backends provide implementations for the same
> backends (sftp,scp).
> switching would then work via
>
> boto3+s3://
> while default will stay
> boto+s3:// equaling s3://
>
> while i'd prefer the above i'd not insist on it, if you can't find the time.
> thanks for this contribution again!.. ede/duply.net
edso (ed.so) wrote : | # |
> On the minus, I would think it someday should become the default for s3. But, implementing a non-default boto3+s3 now doesn't preclude changing the defaults in the future. I expect boto will die completely some day.
that's the beauty of switching backends via scheme. moving the default later simply means add s3:// to the new default backend and remove it from the old, which then can still be selected by the prefixed scheme (if needed). eg.
moving
duplicity.
duplicity.
from paramiko back to pexpect would make the older legacy pexpect backend default again. simple as that.
..ede
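edso's scheme-based registration idea can be sketched roughly as follows. This is an illustrative toy registry, not duplicity's actual API; the function and class names are assumptions modeled on duplicity's register_backend convention:

```python
# Toy sketch of selecting a backend implementation by URL scheme, and of
# how "moving the default" is just moving one registration. Illustrative
# names only; duplicity's real registry lives in duplicity/backend.py.

_backends = {}

def register_backend(scheme, backend_class):
    """Map a URL scheme (e.g. 'boto3+s3') to a backend class."""
    _backends[scheme] = backend_class

def get_backend_class(url):
    """Resolve a backend class from a URL's scheme."""
    scheme = url.split("://", 1)[0]
    return _backends[scheme]

class BotoBackend:       # legacy boto implementation (hypothetical stand-in)
    pass

class S3Boto3Backend:    # newer boto3 implementation (hypothetical stand-in)
    pass

# Current state: the legacy backend owns the default 's3' scheme.
register_backend("boto+s3", BotoBackend)
register_backend("s3", BotoBackend)          # default
register_backend("boto3+s3", S3Boto3Backend)

# Flipping the default later means re-pointing only the 's3' scheme;
# the old backend stays reachable via its prefixed scheme.
register_backend("s3", S3Boto3Backend)

assert get_backend_class("s3://bucket/prefix") is S3Boto3Backend
assert get_backend_class("boto+s3://bucket/prefix") is BotoBackend
```

Each implementation stays selectable by its explicit prefixed scheme regardless of which one currently owns the bare `s3://` default.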
- 1526. By Carl A. Adams
-
renaming boto3 backend file
- 1527. By Carl A. Adams
-
select boto3/s3 backend via url scheme rather than by CLI option. Doc changes to support this.
Carl A. Adams (carlalex) wrote : | # |
Refactored to use the scheme to select the backend. Still should add some test code.
Carl A. Adams (carlalex) wrote : | # |
Not ready for merge. There are test failures in testing/
Carl A. Adams (carlalex) wrote : | # |
So, the test failures in manual/backendtest are in test_delete and test_list. I think the backend is actually listing and deleting as I would expect, but the test is failing due to a type mismatch. These tests as written are looking for file names b'a' and b'b', but list is returning them as regular (unicode?) strings, not byte strings.
In this case, I am not sure if the test is wrong, or if I should change the backend to return byte strings in list rather than unicode strings.
- 1528. By Carl A. Adams
-
Updating comments
Kenneth Loafman (kenneth-loafman) wrote : | # |
Yes, it needs to be bytes. Use util.fsencode() to convert.
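Kenneth's suggestion to convert list results with util.fsencode() can be sketched like this. duplicity's util.fsencode() is assumed to behave like the standard library's os.fsencode(), which is used here to keep the example self-contained:

```python
# Sketch: normalizing a backend's list() results to byte strings, as
# suggested above. os.fsencode() stands in for duplicity's util.fsencode().
import os

def normalize_listing(names):
    """Return every file name as bytes, leaving existing bytes untouched."""
    return [n if isinstance(n, bytes) else os.fsencode(n) for n in names]

# Unicode names come back as byte strings, matching what the tests expect.
assert normalize_listing(["a", b"b"]) == [b"a", b"b"]
```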
edso (ed.so) wrote : | # |
hey Ken,
i see that for other backends this is done automagically in
https:/
can any of you see a reason why it is not for the boto3 backend?.. ede
- 1529. By Carl A. Adams
-
BUGFIX: list should return byte strings, not unicode strings
Kenneth Loafman (kenneth-loafman) wrote : | # |
Don't know. I know that duplicity requires bytes. I have not used the
manual backend test in a while, so it may be out of date.
edso (ed.so) wrote : | # |
maybe this helps
https:/
should probably use 'get_backend' instead of 'get_backend_
https:/
which properly wraps the backend in BackendWrapper()
like it is done in current duplicity, see
https:/
..ede/duply.net
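The wrapping edso refers to explains why live duplicity masks the str-vs-bytes difference: the raw backend is wrapped, and the wrapper normalizes list() results. A minimal sketch, assuming illustrative class and method names rather than duplicity's exact BackendWrapper API:

```python
# Sketch of why a live run hides the type mismatch the manual test catches:
# the wrapper fs-encodes whatever the raw backend's list() returns.
import os

class RawBackend:
    """Stand-in for a backend whose list() returns unicode strings."""
    def list(self):
        return ["a", "b"]

class BackendWrapper:
    """Illustrative wrapper that normalizes list() results to bytes."""
    def __init__(self, backend):
        self.backend = backend

    def list(self):
        return [os.fsencode(n) for n in self.backend.list()]

wrapped = BackendWrapper(RawBackend())
assert wrapped.list() == [b"a", b"b"]
```

A test that drives the raw backend directly (as manual/backendtest apparently does) would see unicode strings, while one going through the wrapper sees bytes.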
Kenneth Loafman (kenneth-loafman) wrote : | # |
Looks like manual.backendtest is way out of date. I think this needs to be moved to functional tests and incorporated into tox testing.
Carl A. Adams (carlalex) wrote : | # |
> Looks like manual.backendtest is way out of date. I think this needs to be
> moved to functional tests and incorporated into tox testing.
Where does that leave the merge request? What else should be done?
edso (ed.so) wrote : | # |
On 03.12.2019 05:06, carlalex wrote:
>> Looks like manual.backendtest is way out of date. I think this needs to be
>> moved to functional tests and incorporated into tox testing.
>
> Where does that leave the merge request? What else should be done?
>
assuming you tested with live duplicity as well, and since other live backends that return the list as strings work properly, i'd say it's fine in this regard.
one thing still, using the prefixed scheme you should probably update the man page section 'Url Format' similarly to how it is already done for the ssh backends.
also could you please add 'boto+s3' to botobackend.py (register_backend() & uses_netloc).
thanks! ..ede/duply.net
edso (ed.so) : | # |
Carl A. Adams (carlalex) wrote : | # |
Testing a live backup against S3, I don't see any difference whether list returns strings or byte strings. The only difference I've seen is in the manual test.
I had added register_backend to s3_boto3_backend, keeping it self contained and following the convention in the ssh backends. The backends do not seem entirely consistent on this point: the two ssh flavors are entirely self contained, while boto and cf separate the implementation from the registration. The primary advantage of separating the registration from the implementation appears to be selecting a backend implementation by CLI option, which I was told is now discouraged. FWIW, I'd say "boto" is not correct for this new backend anyway, since boto3 is really a completely different library, which can coexist with boto in a project. If we do want to register the new backend alongside the older s3 backends in a common location, I'd suggest something named "s3" over "boto", reflecting the backup server type rather than the particular implementation.
I didn't register a netloc. Per the comments in backend.py, that didn't seem correct since the new backend doesn't have a network location. The new URL follows the behavior of the older "s3+http", which is also not in the netloc list.
I had already added boto3+s3 to the url scheme section and an extended explanation under "A note on amazon s3" in my latest updates, so I'm not sure what else you are asking for. Is the bzr merge request not up to date?
- 1530. By Carl A. Adams
-
Update to manpage
edso (ed.so) wrote : | # |
> Testing a live backup against S3, I don't see any difference in list returning
> strings or byte strings. The only difference I've seen is in the manual test.
that's because in live duplicity the backends are wrapped in the BackendWrapper class. see comment https:/
> I had added register_backend to s3_boto3_backend, keeping it self contained
> and following the convention in the ssh backends. The backends do not seem
> entirely consistent on this point, with ssh having the two flavors entirely
> self contained, and boto and cf separating the implementation from the
> registration.
they are not. they were implemented before we switched to the prefixed scheme approach.
>The primary advantage of separating the registration from the
> implementation of the backend appears to be selecting backend implementation
> by CLI option, which I was told was now discouraged. FWIW, I'd say "boto"
> is not correct for this new backend anyway, since boto3 is really a completely
> different library, which can coexist with boto in a project. If we do want to
> register the new backend along side the older s3 backends in a common
> location, I'd suggest something named "s3" over "boto", reflecting the backup
> server type rather than the particular implementation.
not sure what you mean here.
>
> I didn't register a netloc. Per the comments in backend.py, that didn't seem
> correct since the new backend doesn't have a network location. The new URL
> follows the behavior of the older "s3+http", which is also not in the netloc
> list.
i was talking about the older boto backend here (note: i wrote botobackend.py) . your implementation seems not to use netloc indeed.
> I had already added boto3+s3 to the url scheme section and an extended
> explanation under "A note on amazon s3" in my latest updates, so I'm not sure
> what else you are asking for. Is the bzr merge request not up to date?
in the man page there is a section 'Url Format' that explains the url formats per backend (meaning protocol) currently it looks like
-->
URL Format
Duplicity uses the URL format (as standard as possible) to define data locations. The generic format for a URL is:
scheme:
It is not ....
[SNIP]
S3 storage (Amazon)
s3://host[
s3+http://
See also A NOTE ON EUROPEAN S3 BUCKETS
SCP/SFTP access
scp://.. or
sftp://
defaults are paramiko+scp:// and paramiko+sftp://
alternatively try pexpect+scp://, pexpect+sftp://, lftp+sftp://
See also --ssh-askpass, --ssh-options and A NOTE ON SSH BACKENDS.
[SNIP]
<--
see how the alternate backends are documented for scp/sftp? same would be advisable for s3, now that we have two backend (implementations) that provide S3 access.
if you don't want to touch the older botobackend.py i'm fine with that of course.
thanks! ..ede/duply.net
Carl A. Adams (carlalex) wrote : | # |
> not sure what you mean here.
>
Two things: 1) registering in botobackend.py as requested seems to conflict with the request to follow the newer prefix conventions (where the ssh example registers each backend in its own ssh_<backend>.py). I followed the SSH convention when I renamed the new backend s3_boto3_
I think you are looking at an old diff. I updated that when I switched from --s3-use-boto3 to boto3+s3. Did I need to do more than push my change to my branch to update the merge request? (First project that I've used bzr in...)
Kenneth Loafman (kenneth-loafman) wrote : | # |
Yes, just push the changes and let us know.
Thanks for the fixes!
edso (ed.so) wrote : | # |
On 04.12.2019 16:49, Carl A. Adams wrote:
> Two things: 1) registering in botobackend.py as requested seems to conflict with the request to follow the newer prefix conventions (where the example of SSH registers each in their own ssh_<backend>.py). I followed the SSH convention when I renamed it the new backend s3_boto3_
>
no worries. i think we are still misunderstanding each other. doesn't matter though! just leave the botobackend as is and i'll do the changes when i find the time :)
> I think you are looking at an old diff. I updated that when I switched from --s3-use-boto3 to boto3+s3. Did I need to do more than push my change to my branch to update the merge request? (First project that I've used bzr in...)
ok, i see it now. fine by me then!.. ede/duply.net
edso (ed.so) : | # |
Carl A. Adams (carlalex) wrote : | # |
> no worries. i think we are still misunderstanding each other. doesn't matter
> though! just leave the botobackend as is and i'll do the changes when i find
> the time :)
>
That seems likely. It'll be apparent when I see the final change. Thanks for your time.
Carl A. Adams (carlalex) wrote : | # |
> Yes, just push the changes and let us know.
>
> Thanks for the fixes!
The dumb "I've never worked with bzr" question... What do I need to do other than have all the changes in my branch? There is already a merge request outstanding between my branch and lp:duplicity.
Carl A. Adams (carlalex) wrote : | # |
Thanks for merging.
Preview Diff
1 | === modified file '.bzrignore' |
2 | --- .bzrignore 2019-11-24 17:00:02 +0000 |
3 | +++ .bzrignore 2019-12-04 06:04:10 +0000 |
4 | @@ -25,4 +25,5 @@ |
5 | testing/gnupg/.gpg-v21-migrated |
6 | testing/gnupg/S.* |
7 | testing/gnupg/private-keys-v1.d |
8 | +duplicity-venv |
9 | duplicity/backends/rclonebackend.py |
10 | |
11 | === modified file 'bin/duplicity.1' |
12 | --- bin/duplicity.1 2019-05-05 12:16:14 +0000 |
13 | +++ bin/duplicity.1 2019-12-04 06:04:10 +0000 |
14 | @@ -706,7 +706,7 @@ |
15 | Sets the update rate at which duplicity will output the upload progress |
16 | messages (requires |
17 | .BI --progress |
18 | -option). Default is to prompt the status each 3 seconds. |
19 | +option). Default is to print the status each 3 seconds. |
20 | |
21 | .TP |
22 | .BI "--rename " "<original path> <new path>" |
23 | @@ -738,6 +738,13 @@ |
24 | .B EUROPEAN S3 BUCKETS |
25 | section. |
26 | |
27 | +This option does not apply when using the newer boto3 backend, which |
28 | +does not create buckets. |
29 | + |
30 | +See also |
31 | +.B "A NOTE ON AMAZON S3" |
32 | +below. |
33 | + |
34 | .TP |
35 | .BI "--s3-unencrypted-connection" |
36 | Don't use SSL for connections to S3. |
37 | @@ -753,6 +760,12 @@ |
38 | increment files. Unless that is disabled, an observer will not be able to see |
39 | the file names or contents. |
40 | |
41 | +This option is not available when using the newer boto3 backend. |
42 | + |
43 | +See also |
44 | +.B "A NOTE ON AMAZON S3" |
45 | +below. |
46 | + |
47 | .TP |
48 | .BI "--s3-use-new-style" |
49 | When operating on Amazon S3 buckets, use new-style subdomain bucket |
50 | @@ -760,6 +773,13 @@ |
51 | is not backwards compatible if your bucket name contains upper-case |
52 | characters or other characters that are not valid in a hostname. |
53 | |
54 | +This option has no effect when using the newer boto3 backend, which |
55 | +will always use new style subdomain bucket naming. |
56 | + |
57 | +See also |
58 | +.B "A NOTE ON AMAZON S3" |
59 | +below. |
60 | + |
61 | .TP |
62 | .BI "--s3-use-rrs" |
63 | Store volumes using Reduced Redundancy Storage when uploading to Amazon S3. |
64 | @@ -796,6 +816,22 @@ |
65 | all other data is stored in S3 Glacier. |
66 | |
67 | .TP |
68 | +.BI "--s3-use-deep-archive" |
69 | +Store volumes using Glacier Deep Archive S3 when uploading to Amazon S3. This storage class |
70 | +has a lower cost of storage but a higher per-request cost along with delays |
71 | +of up to 48 hours from the time of retrieval request. This storage cost is |
72 | +calculated against a 180-day storage minimum. According to Amazon this storage is |
73 | +ideal for data archiving and long-term backup offering 99.999999999% durability. |
74 | +To restore a backup you will have to manually migrate all data stored on AWS |
75 | +Glacier Deep Archive back to Standard S3 and wait for AWS to complete the migration. |
76 | +.B Notice: |
77 | +Duplicity will store the manifest.gpg files from full and incremental backups on |
78 | +AWS S3 standard storage to allow quick retrieval for later incremental backups, |
79 | +all other data is stored in S3 Glacier Deep Archive. |
80 | + |
81 | +Glacier Deep Archive is only available when using the newer boto3 backend. |
82 | + |
83 | +.TP |
84 | .BI "--s3-use-multiprocessing" |
85 | Allow multipart volumne uploads to S3 through multiprocessing. This option |
86 | requires Python 2.6 and can be used to make uploads to S3 more efficient. |
87 | @@ -803,6 +839,13 @@ |
88 | uploaded in parallel. Useful if you want to saturate your bandwidth |
89 | or if large files are failing during upload. |
90 | |
91 | +This has no effect when using the newer boto3 backend. Boto3 always |
92 | +attempts multiprocessing when it believes this will be more efficient. |
93 | + |
94 | +See also |
95 | +.B "A NOTE ON AMAZON S3" |
96 | +below. |
97 | + |
98 | .TP |
99 | .BI "--s3-use-server-side-encryption" |
100 | Allow use of server side encryption in S3 |
101 | @@ -814,6 +857,12 @@ |
102 | to maximize the use of your bandwidth. For example, a chunk size of 10MB |
103 | with a volsize of 30MB will result in 3 chunks per volume upload. |
104 | |
105 | +This has no effect when using the newer boto3 backend. |
106 | + |
107 | +See also |
108 | +.B "A NOTE ON AMAZON S3" |
109 | +below. |
110 | + |
111 | .TP |
112 | .BI "--s3-multipart-max-procs" |
113 | Specify the maximum number of processes to spawn when performing a multipart |
114 | @@ -822,6 +871,12 @@ |
115 | required to ensure you don't overload your system while maximizing the use of |
116 | your bandwidth. |
117 | |
118 | +This has no effect when using the newer boto3 backend. |
119 | + |
120 | +See also |
121 | +.B "A NOTE ON AMAZON S3" |
122 | +below. |
123 | + |
124 | .TP |
125 | .BI "--s3-multipart-max-timeout" |
126 | You can control the maximum time (in seconds) a multipart upload can spend on |
127 | @@ -829,6 +884,12 @@ |
128 | hanging on multipart uploads or if you'd like to control the time variance |
129 | when uploading to S3 to ensure you kill connections to slow S3 endpoints. |
130 | |
131 | +This has no effect when using the newer boto3 backend. |
132 | + |
133 | +See also |
134 | +.B "A NOTE ON AMAZON S3" |
135 | +below. |
136 | + |
137 | .TP |
138 | .BI "--azure-blob-tier" |
139 | Standard storage tier used for backup files (Hot|Cool|Archive). |
140 | @@ -1259,10 +1320,14 @@ |
141 | s3://host[:port]/bucket_name[/prefix] |
142 | .br |
143 | s3+http://bucket_name[/prefix] |
144 | +.br |
145 | +boto3+s3://bucket_name[/prefix] |
146 | .PP |
147 | See also |
148 | +.B "A NOTE ON AMAZON S3" |
149 | +and |
150 | .B "A NOTE ON EUROPEAN S3 BUCKETS" |
151 | -.RE |
152 | +below. |
153 | .PP |
154 | .B "SCP/SFTP access" |
155 | .PP |
156 | @@ -1628,6 +1693,40 @@ |
157 | .IR |
158 | .RE |
159 | |
160 | +.SH A NOTE ON AMAZON S3 |
161 | +When backing up to Amazon S3, two backend implementations are available. |
162 | +The schemes "s3" and "s3+http" are implemented using the older boto library, |
163 | +which has been deprecated and is no longer supported. The "boto3+s3" scheme |
164 | +is based on the newer boto3 library. This new backend fixes several known |
165 | +limitations in the older backend, which have crept in as |
166 | +Amazon S3 has evolved while the deprecated boto library has not kept up. |
167 | + |
168 | +The boto3 backend should behave largely the same as the older S3 backend, |
169 | +but there are some differences in the handling of some of the "S3" options. |
170 | +Additionally, there are some compatibility differences with the new backend. |
171 | +Because of these reasons, both backends have been retained for the time being. |
172 | +See the documentation for specific options regarding differences related to |
173 | +each backend. |
174 | + |
175 | +The boto3 backend does not support bucket creation. |
176 | +This is a deliberate choice which simplifies the code and sidesteps |
177 | +problems related to region selection. Additionally, it is probably |
178 | +not a good practice to give your backup role bucket creation rights. |
179 | +In most cases the role used for backups should probably be |
180 | +limited to specific buckets. |
181 | + |
182 | +The boto3 backend only supports newer domain style buckets. Amazon is moving |
183 | +to deprecate the older bucket style, so migration is recommended. |
184 | +Use the older s3 backend for compatibility with backups stored in |
185 | +buckets using older naming conventions. |
186 | + |
187 | +The boto3 backend does not currently support initiating restores |
188 | +from the glacier storage class. When restoring a backup from |
189 | +glacier or glacier deep archive, the backup files must first be |
190 | +restored out of band. There are multiple options when restoring |
191 | +backups from cold storage, which vary in both cost and speed. |
192 | +See Amazon's documentation for details. |
193 | + |
194 | .SH A NOTE ON AZURE ACCESS |
195 | The Azure backend requires the Microsoft Azure Storage SDK for Python to be |
196 | installed on the system. |
197 | |
198 | === added file 'duplicity/backends/s3_boto3_backend.py' |
199 | --- duplicity/backends/s3_boto3_backend.py 1970-01-01 00:00:00 +0000 |
200 | +++ duplicity/backends/s3_boto3_backend.py 2019-12-04 06:04:10 +0000 |
201 | @@ -0,0 +1,205 @@ |
202 | +# -*- Mode:Python; indent-tabs-mode:nil; tab-width:4 -*- |
203 | +# |
204 | +# Copyright 2002 Ben Escoto <ben@emerose.org> |
205 | +# Copyright 2007 Kenneth Loafman <kenneth@loafman.com> |
206 | +# Copyright 2019 Carl A. Adams <carlalex@overlords.com> |
207 | +# |
208 | +# This file is part of duplicity. |
209 | +# |
210 | +# Duplicity is free software; you can redistribute it and/or modify it |
211 | +# under the terms of the GNU General Public License as published by the |
212 | +# Free Software Foundation; either version 2 of the License, or (at your |
213 | +# option) any later version. |
214 | +# |
215 | +# Duplicity is distributed in the hope that it will be useful, but |
216 | +# WITHOUT ANY WARRANTY; without even the implied warranty of |
217 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
218 | +# General Public License for more details. |
219 | +# |
220 | +# You should have received a copy of the GNU General Public License |
221 | +# along with duplicity; if not, write to the Free Software Foundation, |
222 | +# Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
223 | + |
224 | +import duplicity.backend |
225 | +from duplicity import globals |
226 | +from duplicity import log |
227 | +from duplicity.errors import FatalBackendException, BackendException |
228 | +from duplicity import util |
229 | +from duplicity import progress |
230 | + |
231 | + |
232 | +# Note: current gaps with the old boto backend include: |
233 | +# - Glacier restore to S3 not implemented. Should this |
234 | +# be done here? Or is that out of scope. My current opinion |
235 | +# is that it is out of scope, and the manpage reflects this. |
236 | +# It can take days, so waiting seems like it's not ideal. |
237 | +# "Thaw" isn't currently a generic concept that the core asks |
238 | +# of back-ends. Perhaps that is worth exploring. The older |
239 | +# boto backend appeared to attempt this restore in the code, |
240 | +# but the man page indicated that restores should be done out |
241 | +of band. If implemented, we should add the following |
242 | +# new features: |
243 | +# - when restoring from glacier or deep archive, specify TTL. |
244 | +# - allow user to specify how fast to restore (impacts cost). |
245 | + |
246 | +class S3Boto3Backend(duplicity.backend.Backend): |
247 | + u""" |
248 | + Backend for Amazon's Simple Storage System (aka Amazon S3), through |
249 | + the use of the boto3 module. (See |
250 | + https://boto3.amazonaws.com/v1/documentation/api/latest/index.html |
251 | + for information on boto3.) |
252 | +. |
253 | + Pursuant to Amazon's announced deprecation of path style S3 access, |
254 | + this backend only supports virtual host style bucket URIs. |
255 | + See the man page for full details. |
256 | + |
257 | + To make use of this backend, you must provide AWS credentials. |
258 | + This may be done in several ways: through the environment variables |
259 | + AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, by the |
260 | + ~/.aws/credentials file, by the ~/.aws/config file, |
261 | + or by using the boto2 style ~/.boto or /etc/boto.cfg files. |
262 | + """ |
263 | + |
264 | + def __init__(self, parsed_url): |
265 | + duplicity.backend.Backend.__init__(self, parsed_url) |
266 | + |
267 | + # This folds the null prefix and all null parts, which means that: |
268 | + # //MyBucket/ and //MyBucket are equivalent. |
269 | + # //MyBucket//My///My/Prefix/ and //MyBucket/My/Prefix are equivalent. |
270 | + url_path_parts = [x for x in parsed_url.path.split(u'/') if x != u''] |
271 | + if url_path_parts: |
272 | + self.bucket_name = url_path_parts.pop(0) |
273 | + else: |
274 | + raise BackendException(u'S3 requires a bucket name.') |
275 | + |
276 | + if url_path_parts: |
277 | + self.key_prefix = u'%s/' % u'/'.join(url_path_parts) |
278 | + else: |
279 | + self.key_prefix = u'' |
280 | + |
281 | + self.parsed_url = parsed_url |
282 | + self.straight_url = duplicity.backend.strip_auth_from_url(parsed_url) |
283 | + self.s3 = None |
284 | + self.bucket = None |
285 | + self.tracker = UploadProgressTracker() |
286 | + self.reset_connection() |
287 | + |
288 | + def reset_connection(self): |
289 | + import boto3 |
290 | + import botocore |
291 | + from botocore.exceptions import ClientError |
292 | + |
293 | + self.bucket = None |
294 | + self.s3 = boto3.resource('s3') |
295 | + |
296 | + try: |
297 | + self.s3.meta.client.head_bucket(Bucket=self.bucket_name) |
298 | + except botocore.exceptions.ClientError as bce: |
299 | + error_code = bce.response['Error']['Code'] |
300 | + if error_code == '404': |
301 | + raise FatalBackendException(u'S3 bucket "%s" does not exist' % self.bucket_name, |
302 | + code=log.ErrorCode.backend_not_found) |
303 | + else: |
304 | + raise |
305 | + |
306 | + self.bucket = self.s3.Bucket(self.bucket_name) # only set if bucket is thought to exist. |
307 | + |
308 | + def _put(self, local_source_path, remote_filename): |
309 | + remote_filename = util.fsdecode(remote_filename) |
310 | + key = self.key_prefix + remote_filename |
311 | + |
312 | + if globals.s3_use_rrs: |
313 | + storage_class = u'REDUCED_REDUNDANCY' |
314 | + elif globals.s3_use_ia: |
315 | + storage_class = u'STANDARD_IA' |
316 | + elif globals.s3_use_onezone_ia: |
317 | + storage_class = u'ONEZONE_IA' |
318 | + elif globals.s3_use_glacier and u"manifest" not in remote_filename: |
319 | + storage_class = u'GLACIER' |
320 | + elif globals.s3_use_deep_archive and u"manifest" not in remote_filename: |
321 | + storage_class = u'DEEP_ARCHIVE' |
322 | + else: |
323 | + storage_class = u'STANDARD' |
324 | + extra_args = {u'StorageClass': storage_class} |
325 | + |
326 | + if globals.s3_use_sse: |
327 | + extra_args[u'ServerSideEncryption'] = u'AES256' |
328 | + elif globals.s3_use_sse_kms: |
329 | + if globals.s3_kms_key_id is None: |
330 | + raise FatalBackendException(u"S3 SSE-KMS was requested, but no key id was "
331 | + u"provided (--s3-kms-key-id is required)",
332 | + code=log.ErrorCode.s3_kms_no_id) |
333 | + extra_args[u'ServerSideEncryption'] = u'aws:kms' |
334 | + extra_args[u'SSEKMSKeyId'] = globals.s3_kms_key_id |
335 | + if globals.s3_kms_grant: |
336 | + extra_args[u'GrantFullControl'] = globals.s3_kms_grant |
337 | + |
338 | + # Should the tracker be scoped to the put or the backend? |
339 | + # The put seems right to me, but the results look a little more correct |
340 | + # scoped to the backend. This brings up questions about knowing when |
341 | + # it's proper for it to be reset. |
342 | + # tracker = UploadProgressTracker() # Scope the tracker to the put() |
343 | + tracker = self.tracker |
344 | + |
345 | + log.Info(u"Uploading %s/%s to %s Storage" % (self.straight_url, remote_filename, storage_class)) |
346 | + self.s3.Object(self.bucket.name, key).upload_file(local_source_path.uc_name, |
347 | + Callback=tracker.progress_cb, |
348 | + ExtraArgs=extra_args) |
349 | + |
350 | + def _get(self, remote_filename, local_path): |
351 | + remote_filename = util.fsdecode(remote_filename) |
352 | + key = self.key_prefix + remote_filename |
353 | + self.s3.Object(self.bucket.name, key).download_file(local_path.uc_name) |
354 | + |
355 | + def _list(self): |
356 | + filename_list = [] |
357 | + for obj in self.bucket.objects.filter(Prefix=self.key_prefix): |
358 | + try: |
359 | + filename = obj.key.replace(self.key_prefix, u'', 1) |
360 | + filename_list.append(util.fsencode(filename)) |
361 | + log.Debug(u"Listed %s/%s" % (self.straight_url, filename)) |
362 | + except AttributeError: |
363 | + pass |
364 | + return filename_list |
365 | + |
366 | + def _delete(self, remote_filename): |
367 | + remote_filename = util.fsdecode(remote_filename) |
368 | + key = self.key_prefix + remote_filename |
369 | + self.s3.Object(self.bucket.name, key).delete() |
370 | + |
371 | + def _query(self, remote_filename): |
372 | + import botocore |
373 | + from botocore.exceptions import ClientError |
374 | + |
375 | + remote_filename = util.fsdecode(remote_filename) |
376 | + key = self.key_prefix + remote_filename |
377 | + content_length = -1 |
378 | + try: |
379 | + s3_obj = self.s3.Object(self.bucket.name, key) |
380 | + s3_obj.load() |
381 | + content_length = s3_obj.content_length |
382 | + except botocore.exceptions.ClientError as bce: |
383 | + if bce.response['Error']['Code'] == '404': |
384 | + pass |
385 | + else: |
386 | + raise |
387 | + return {u'size': content_length} |
388 | + |
389 | + |
390 | +class UploadProgressTracker(object): |
391 | + def __init__(self): |
392 | + self.total_bytes = 0 |
393 | + |
394 | + def progress_cb(self, fresh_byte_count): |
395 | + self.total_bytes += fresh_byte_count |
396 | + progress.report_transfer(self.total_bytes, 0) # second arg appears to be unused |
397 | + # It would seem to me that summing progress should be the callers job, |
398 | + # and backends should just toss bytes written numbers over the fence. |
399 | + # But, the progress bar doesn't work in a reasonable way when we do |
400 | + # that. (This would also eliminate the need for this class to hold |
401 | + # the scoped rolling total.) |
402 | + # progress.report_transfer(fresh_byte_count, 0) |
403 | + |
404 | + |
405 | +duplicity.backend.register_backend(u"boto3+s3", S3Boto3Backend) |
406 | +# duplicity.backend.uses_netloc.extend([u'boto3+s3']) |
407 | |
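The constructor's URL-path handling above folds away empty path segments before splitting off the bucket name. That folding can be sketched as a standalone helper (the function name is illustrative, not part of duplicity's API; the behavior mirrors the list comprehension in `__init__`):

```python
def parse_bucket_and_prefix(url_path):
    """Sketch of the backend's URL-path folding: empty segments collapse,
    so //MyBucket//My///Prefix/ and //MyBucket/My/Prefix are equivalent."""
    parts = [p for p in url_path.split(u'/') if p != u'']
    if not parts:
        raise ValueError(u'S3 requires a bucket name.')
    bucket = parts[0]
    # Any remaining segments become the key prefix, with a trailing slash.
    prefix = u'%s/' % u'/'.join(parts[1:]) if len(parts) > 1 else u''
    return bucket, prefix
```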
408 | === modified file 'duplicity/commandline.py' |
409 | --- duplicity/commandline.py 2019-11-24 17:00:02 +0000 |
410 | +++ duplicity/commandline.py 2019-12-04 06:04:10 +0000 |
411 | @@ -506,7 +506,7 @@ |
412 | # support european for now). |
413 | parser.add_option(u"--s3-european-buckets", action=u"store_true") |
414 | |
415 | - # Whether to use S3 Reduced Redudancy Storage |
416 | + # Whether to use S3 Reduced Redundancy Storage |
417 | parser.add_option(u"--s3-use-rrs", action=u"store_true") |
418 | |
419 | # Whether to use S3 Infrequent Access Storage |
420 | @@ -515,6 +515,9 @@ |
421 | # Whether to use S3 Glacier Storage |
422 | parser.add_option(u"--s3-use-glacier", action=u"store_true") |
423 | |
424 | + # Whether to use S3 Glacier Deep Archive Storage |
425 | + parser.add_option(u"--s3-use-deep-archive", action=u"store_true") |
426 | + |
427 | # Whether to use S3 One Zone Infrequent Access Storage |
428 | parser.add_option(u"--s3-use-onezone-ia", action=u"store_true") |
429 | |
430 | @@ -948,6 +951,7 @@ |
431 | rsync://%(user)s[:%(password)s]@%(other_host)s[:%(port)s]//%(absolute_path)s |
432 | s3://%(other_host)s[:%(port)s]/%(bucket_name)s[/%(prefix)s] |
433 | s3+http://%(bucket_name)s[/%(prefix)s] |
434 | + boto3+s3://%(bucket_name)s[/%(prefix)s] |
435 | scp://%(user)s[:%(password)s]@%(other_host)s[:%(port)s]/%(some_dir)s |
436 | ssh://%(user)s[:%(password)s]@%(other_host)s[:%(port)s]/%(some_dir)s |
437 | swift://%(container_name)s |
438 | |
439 | === modified file 'duplicity/globals.py' |
440 | --- duplicity/globals.py 2019-05-17 16:41:49 +0000 |
441 | +++ duplicity/globals.py 2019-12-04 06:04:10 +0000 |
442 | @@ -200,6 +200,9 @@ |
443 | # Whether to use S3 Glacier Storage |
444 | s3_use_glacier = False |
445 | |
446 | +# Whether to use S3 Glacier Deep Archive Storage |
447 | +s3_use_deep_archive = False |
448 | + |
449 | # Whether to use S3 One Zone Infrequent Access Storage |
450 | s3_use_onezone_ia = False |
451 | |
452 | |
453 | === modified file 'requirements.txt' |
454 | --- requirements.txt 2019-11-16 17:15:49 +0000 |
455 | +++ requirements.txt 2019-12-04 06:04:10 +0000 |
456 | @@ -26,6 +26,7 @@ |
457 | # azure |
458 | # b2sdk |
459 | # boto |
460 | +# boto3 |
461 | # dropbox==6.9.0 |
462 | # gdata |
463 | # jottalib |
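The `_put` method in the diff picks exactly one storage class from the mutually exclusive `--s3-use-*` flags, and keeps manifest files out of the cold tiers (GLACIER, DEEP_ARCHIVE) so collection status can still be read without a restore. That selection logic can be sketched as a standalone function (the function name and keyword arguments are illustrative stand-ins for the `globals` options, not duplicity's API):

```python
def choose_storage_class(filename, use_rrs=False, use_ia=False, use_onezone_ia=False,
                         use_glacier=False, use_deep_archive=False):
    """Mirror of the backend's storage-class selection. Manifests are kept
    out of cold tiers so they stay immediately readable."""
    if use_rrs:
        return u'REDUCED_REDUNDANCY'
    if use_ia:
        return u'STANDARD_IA'
    if use_onezone_ia:
        return u'ONEZONE_IA'
    if use_glacier and u'manifest' not in filename:
        return u'GLACIER'
    if use_deep_archive and u'manifest' not in filename:
        return u'DEEP_ARCHIVE'
    return u'STANDARD'
```

The result is passed to boto3 as `ExtraArgs={u'StorageClass': ...}` on `upload_file`, alongside any server-side-encryption settings.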
looks good! and man page is adapted as well. i like it.
there's just one issue that i'd like to change. switching actual backend via --parameter is deprecated since we adopted the stacked scheme. you can check eg.
ssh_paramiko_backend.py
ssh_pexpect_backend.py
for implementations where two backends provide the same schemes (sftp, scp).
switching would then work via
boto3+s3://
while default will stay
boto+s3:// equaling s3://
while i'd prefer the above i'd not insist on it, if you can't find the time. thanks for this contribution again!.. ede/duply.net
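The stacked scheme edso describes works because duplicity keeps a single registry keyed on the URL scheme, so two implementations can serve the same storage under different schemes while one of them keeps the default. A self-contained sketch of that kind of dispatch (class and method names here are hypothetical, not duplicity's actual registry):

```python
class BackendRegistry(object):
    """Minimal scheme-to-backend dispatch, sketching the stacked scheme."""

    def __init__(self):
        self._backends = {}

    def register(self, scheme, backend_cls):
        self._backends[scheme] = backend_cls

    def get(self, url):
        # Dispatch purely on the part before '://'.
        scheme = url.split(u'://', 1)[0]
        return self._backends[scheme]


class BotoBackend(object):
    pass  # stand-in for the legacy boto implementation


class Boto3Backend(object):
    pass  # stand-in for the new boto3 implementation


registry = BackendRegistry()
# default stays on boto; boto3+s3:// opts in to the new backend
registry.register(u's3', BotoBackend)
registry.register(u'boto+s3', BotoBackend)
registry.register(u'boto3+s3', Boto3Backend)
```

Under this scheme the last line of the new backend file would register only `boto3+s3`, leaving `s3://` untouched until a future default switch.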