diff -Nru datalad-0.12.4/CHANGELOG.md datalad-0.12.6/CHANGELOG.md --- datalad-0.12.4/CHANGELOG.md 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/CHANGELOG.md 2020-04-23 18:42:28.000000000 +0000 @@ -10,6 +10,84 @@ [DataLad git repository](http://github.com/datalad/datalad) for more details. +## 0.12.6 (April 23, 2020) -- . + +### Major refactoring and deprecations + +- The value of `datalad.support.annexrep.N_AUTO_JOBS` is no longer + considered. The variable will be removed in a later release. + ([#4409][]) + +### Fixes + +- Starting with v0.12.0, `datalad save` recorded the current branch of + a parent dataset as the `branch` value in the .gitmodules entry for + a subdataset. This behavior is problematic for a few reasons and + has been reverted. ([#4375][]) + +- The default for the `--jobs` option, "auto", instructed DataLad to + pass a value to git-annex's `--jobs` equal to `min(8, max(3, ))`, which could lead to issues due to the large number of + child processes spawned and file descriptors opened. To avoid this + behavior, `--jobs=auto` now results in git-annex being called with + `--jobs=1` by default. Configure the new option + `datalad.runtime.max-annex-jobs` to control the maximum value that + will be considered when `--jobs='auto'`. ([#4409][]) + +- Various commands have been adjusted to better handle the case where + a remote's HEAD ref points to an unborn branch. ([#4370][]) + +- [search] + - learned to use the query as a regular expression that restricts + the keys that are shown for `--show-keys short`. ([#4354][]) + - gives a more helpful message when query is an invalid regular + expression. ([#4398][]) + +- The code for parsing Git configuration did not follow Git's behavior + of accepting a key with no value as shorthand for key=true. ([#4421][]) + +- `AnnexRepo.info` needed a compatibility update for a change in how + git-annex reports file names. 
([#4431][]) + +- [create-sibling-github][] did not gracefully handle a token that did + not have the necessary permissions. ([#4400][]) + +### Enhancements and new features + +- [search] learned to use the query as a regular expression that + restricts the keys that are shown for `--show-keys short`. ([#4354][]) + +- `datalad ` learned to point to the [datalad-container][] + extension when a subcommand from that extension is given but the + extension is not installed. ([#4400][]) ([#4174][]) + + +## 0.12.5 (Apr 02, 2020) -- a small step for datalad ... + +Fix some bugs and make the world an even better place. + +### Fixes + +- Our `log_progress` helper mishandled the initial display and step of + the progress bar. ([#4326][]) + +- `AnnexRepo.get_content_annexinfo` is designed to accept `init=None`, + but passing that led to an error. ([#4330][]) + +- Update a regular expression to handle an output change in Git + v2.26.0. ([#4328][]) + +- We now set `LC_MESSAGES` to 'C' while running git to avoid failures + when parsing output that is marked for translation. ([#4342][]) + +- The helper for decoding JSON streams loaded the last line of input + without decoding it if the line didn't end with a new line, a + regression introduced in the 0.12.0 release. ([#4361][]) + +- The clone command failed to git-annex-init a fresh clone whenever + it considered to add the origin of the origin as a remote. ([#4367][]) + + ## 0.12.4 (Mar 19, 2020) -- Windows?!  The main purpose of this release is to have one on PyPi that has no @@ -17,8 +95,10 @@ ### Fixes -- Adjust the behavior of the `log.outputs` config switch to make outputs - visible. Its description was adjusted accordingly. +- The description of the `log.outputs` config switch did not keep up + with code changes and incorrectly stated that the output would be + logged at the DEBUG level; logging actually happens at a lower + level. ([#4317][]) ## 0.12.3 (March 16, 2020) -- . 
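Note on the `--jobs=auto` entry in the 0.12.6 changelog hunk above: the sketch below illustrates how the new `datalad.runtime.max-annex-jobs` setting could cap the job count requested from git-annex. It is a reading of the changelog entry and of the option's description in `common_cfg.py` (the job count does not exceed the number of CPU cores, or 3 when there are fewer than 3 cores). It is not DataLad's actual implementation, and `resolve_jobs` and its parameters are hypothetical names.

```python
import os


def resolve_jobs(jobs, max_annex_jobs=1, ncpus=None):
    """Return the job count to request from git-annex (hypothetical helper).

    jobs: an int, or the string "auto".
    max_annex_jobs: stands in for 'datalad.runtime.max-annex-jobs'
        (the changelog gives its default as 1).
    ncpus: CPU core count; detected with os.cpu_count() when None.
    """
    if jobs != "auto":
        # An explicit number is passed through unchanged.
        return jobs
    if ncpus is None:
        ncpus = os.cpu_count() or 1
    # Upper bound from the CPU count, with a floor of 3 cores (assumption
    # taken from the option's description), then capped by the setting.
    return min(max_annex_jobs, max(3, ncpus))


if __name__ == "__main__":
    print(resolve_jobs("auto"))                              # default cap of 1
    print(resolve_jobs("auto", max_annex_jobs=8, ncpus=4))   # limited by cores
```

With the default cap of 1, "auto" always resolves to a single job, which matches the changelog statement that `--jobs=auto` now results in `--jobs=1` unless the option is configured.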
@@ -2442,6 +2522,7 @@ [#4073]: https://github.com/datalad/datalad/issues/4073 [#4078]: https://github.com/datalad/datalad/issues/4078 [#4140]: https://github.com/datalad/datalad/issues/4140 +[#4174]: https://github.com/datalad/datalad/issues/4174 [#4194]: https://github.com/datalad/datalad/issues/4194 [#4200]: https://github.com/datalad/datalad/issues/4200 [#4212]: https://github.com/datalad/datalad/issues/4212 @@ -2451,3 +2532,18 @@ [#4285]: https://github.com/datalad/datalad/issues/4285 [#4308]: https://github.com/datalad/datalad/issues/4308 [#4315]: https://github.com/datalad/datalad/issues/4315 +[#4317]: https://github.com/datalad/datalad/issues/4317 +[#4326]: https://github.com/datalad/datalad/issues/4326 +[#4328]: https://github.com/datalad/datalad/issues/4328 +[#4330]: https://github.com/datalad/datalad/issues/4330 +[#4342]: https://github.com/datalad/datalad/issues/4342 +[#4354]: https://github.com/datalad/datalad/issues/4354 +[#4361]: https://github.com/datalad/datalad/issues/4361 +[#4367]: https://github.com/datalad/datalad/issues/4367 +[#4370]: https://github.com/datalad/datalad/issues/4370 +[#4375]: https://github.com/datalad/datalad/issues/4375 +[#4398]: https://github.com/datalad/datalad/issues/4398 +[#4400]: https://github.com/datalad/datalad/issues/4400 +[#4409]: https://github.com/datalad/datalad/issues/4409 +[#4421]: https://github.com/datalad/datalad/issues/4421 +[#4431]: https://github.com/datalad/datalad/issues/4431 diff -Nru datalad-0.12.4/CONTRIBUTING.md datalad-0.12.6/CONTRIBUTING.md --- datalad-0.12.4/CONTRIBUTING.md 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/CONTRIBUTING.md 2020-04-23 18:42:28.000000000 +0000 @@ -545,7 +545,7 @@ For the upcoming release use this template -## 0.12.5 (??? ??, 2020) -- will be better than ever +## 0.12.7 (??? ??, 2020) -- will be better than ever bet we will fix some bugs and make a world even a better place. 
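Note on the Git configuration fix ([#4421][]) listed in the changelog above and implemented in the `datalad/config.py` hunks further down: Git accepts a key with no value as shorthand for `key=true`. The sketch below shows the idea on `git config --list`-style lines. It is a simplified illustration, not the DataLad parser: `parse_config_lines` and `get_bool` are hypothetical names, a plain split on the first `=` stands in for DataLad's `cfg_kv_regex`, and repeated keys simply overwrite each other.

```python
def parse_config_lines(lines):
    """Parse 'key=value' lines into a dict (hypothetical helper).

    A line without '=' is stored with the value None, which marks it
    as a bare key.
    """
    parsed = {}
    for line in lines:
        key, sep, value = line.partition("=")
        parsed[key] = value if sep else None
    return parsed


def get_bool(parsed, key, default=False):
    """Interpret a parsed value as a boolean, Git style (hypothetical helper)."""
    if key not in parsed:
        return default
    value = parsed[key]
    if value is None:
        # Bare key with no value at all: Git treats it as true.
        return True
    return value.strip().lower() in ("true", "yes", "on", "1")


if __name__ == "__main__":
    cfg = parse_config_lines(["core.bare=false", "annex.thin"])
    print(get_bool(cfg, "annex.thin"), get_bool(cfg, "core.bare"))
```

A key written with an explicit empty value (`key=`) is kept distinct from a bare key here, since only the bare form is the shorthand for true.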
diff -Nru datalad-0.12.4/CONTRIBUTORS datalad-0.12.6/CONTRIBUTORS --- datalad-0.12.4/CONTRIBUTORS 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/CONTRIBUTORS 2020-04-23 18:42:28.000000000 +0000 @@ -1,12 +1,15 @@ The following people have contributed to DataLad: +Adina Wagner Alejandro de la Vega Alex Waite Anisha Keshavan Benjamin Poldrack Christian Olaf Häusler +Christopher J. Markiewicz Dave MacFarlane Debanjum Singh Solanky +Elizabeth DuPre Feilong Ma Gergana Alteva Horea Christian @@ -14,9 +17,12 @@ Jorrit Poelen Kusti Skytén Kyle Meyer +Laura Waite Matteo Visconti dOC Michael Hanke Nell Hardcastle +Neuroimaging Community +Soichi Hayashi Taylor Olson Torsten Stoeter Vanessa Sochat diff -Nru datalad-0.12.4/datalad/cmd.py datalad-0.12.6/datalad/cmd.py --- datalad-0.12.4/datalad/cmd.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/cmd.py 2020-04-23 18:42:28.000000000 +0000 @@ -704,6 +704,14 @@ git_env['GIT_SSH_COMMAND'] = GIT_SSH_COMMAND git_env['GIT_SSH_VARIANT'] = 'ssh' + # We are parsing error messages and hints. 
For those to work more + # reliably we are doomed to sacrifice i18n effort of git, and enforce + # consistent language of the messages + git_env['LC_MESSAGES'] = 'C' + # But since LC_ALL takes precedence, over LC_MESSAGES, we cannot + # "leak" that one inside, and are doomed to pop it + git_env.pop('LC_ALL', None) + return git_env def run(self, cmd, env=None, *args, **kwargs): diff -Nru datalad-0.12.4/datalad/config.py datalad-0.12.6/datalad/config.py --- datalad-0.12.4/datalad/config.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/config.py 2020-04-23 18:42:28.000000000 +0000 @@ -93,7 +93,13 @@ if line.startswith('command line:'): # nothing we could handle continue - k, v = cfg_kv_regex.match(line).groups() + kv_match = cfg_kv_regex.match(line) + if kv_match: + k, v = kv_match.groups() + else: + # could be just a key without = value, which git treats as True + # if asked for a bool + k, v = line, None present_v = dct.get(k, None) if present_v is None: dct[k] = v @@ -542,6 +548,8 @@ TypeError is raised for other values. 
""" val = self.get_value(section, option, default=default) + if val is None: # no value at all, git treats it as True + return True return anything2bool(val) def getfloat(self, section, option): diff -Nru datalad-0.12.4/datalad/consts.py datalad-0.12.6/datalad/consts.py --- datalad-0.12.4/datalad/consts.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/consts.py 2020-04-23 18:42:28.000000000 +0000 @@ -69,6 +69,7 @@ # git/datalad configuration item to provide a token for github CONFIG_HUB_TOKEN_FIELD = 'hub.oauthtoken' +GITHUB_LOGIN_URL = 'https://github.com/login' # format of git-annex adjusted branch names ADJUSTED_BRANCH_EXPR = re.compile(r'^adjusted/(?P[^(]+)\(.*\)$') diff -Nru datalad-0.12.4/datalad/core/distributed/clone.py datalad-0.12.6/datalad/core/distributed/clone.py --- datalad-0.12.4/datalad/core/distributed/clone.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/core/distributed/clone.py 2020-04-23 18:42:28.000000000 +0000 @@ -451,6 +451,9 @@ # next candidate continue + if not cand.get("version"): + postclone_check_head(destds) + # perform any post-processing that needs to know details of the clone # source if cand['type'] == 'ria': @@ -506,6 +509,34 @@ yield get_status_dict(status='ok', **result_props) +def postclone_check_head(ds): + repo = ds.repo + if not repo.commit_exists("HEAD"): + # HEAD points to an unborn branch. A likely cause of this is that the + # remote's main branch is something other than master but HEAD wasn't + # adjusted accordingly. + # + # Let's choose the most recently updated remote ref (according to + # commit date). In the case of a submodule, switching to a ref with + # commits prevents .update_submodule() from failing. It is likely that + # the ref includes the registered commit, but we don't have the + # information here to know for sure. If it doesn't, .update_submodule() + # will check out a detached HEAD. 
+ remote_branches = ( + b["refname:strip=2"] for b in repo.for_each_ref_( + fields="refname:strip=2", sort="-committerdate", + pattern="refs/remotes/origin")) + for rbranch in remote_branches: + if rbranch in ["origin/git-annex", "HEAD"]: + continue + repo.call_git(["checkout", "-b", rbranch[7:], # drop "origin/" + "--track", rbranch]) + lgr.debug("Checked out local branch from %s", rbranch) + return + lgr.warning("Cloned %s but could not find a branch " + "with commits", ds.path) + + def postclonecfg_ria(ds, props): """Configure a dataset freshly cloned from a RIA store""" # RIA uses hashdir mixed, copying data to it via git-annex (if cloned via @@ -564,10 +595,6 @@ ds.config.set( 'annex.hardlink', 'true', where='local', reload=True) - # we have just cloned the repo, so it has 'origin', configure any - # reachable origin of origins - yield from configure_origins(ds, ds) - lgr.debug("Initializing annex repo at %s", ds.path) # Note, that we cannot enforce annex-init via AnnexRepo(). # If such an instance already exists, its __init__ will not be executed. @@ -643,6 +670,11 @@ srs[False][0] if len(srs[False]) == 1 else "SIBLING", ) + # we have just cloned the repo, so it has 'origin', configure any + # reachable origin of origins + yield from configure_origins(ds, ds) + + _handle_possible_annex_dataset = postclonecfg_annexdataset diff -Nru datalad-0.12.4/datalad/core/distributed/tests/test_clone.py datalad-0.12.6/datalad/core/distributed/tests/test_clone.py --- datalad-0.12.4/datalad/core/distributed/tests/test_clone.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/core/distributed/tests/test_clone.py 2020-04-23 18:42:28.000000000 +0000 @@ -505,8 +505,13 @@ # Clone another level, this time with a relative path. Drop content from # lev2 so that origin is the only place that the file is available from. 
clone_lev2.drop("file1.txt") - with chpwd(path): + with chpwd(path), swallow_logs(new_level=9) as cml: clone_lev3 = clone('clone_lev2', 'clone_lev3') + # we called git-annex-init; see gh-4367: + cml.assert_logged(msg=r"[^[]*Running: \[('git', 'annex'|'git-annex'), " + r"'init'", + match=False, + level='Level 9') assert_result_count( clone_lev3.get('file1.txt', on_failure='ignore'), 1, @@ -763,3 +768,80 @@ ok_(ds.is_installed()) eq_(ds.id, datalad_store_testds_id) + +@with_tempfile(mkdir=True) +def test_clone_unborn_head(path): + ds_origin = Dataset(op.join(path, "a")).create() + repo = ds_origin.repo + managed = repo.is_managed_branch() + + # The setup below is involved, mostly because it's accounting for adjusted + # branches. The scenario itself isn't so complicated, though: + # + # * a checked out master branch with no commits + # * a (potentially adjusted) "abc" branch with commits. + # * a (potentially adjusted) "chooseme" branch whose tip commit has a + # more recent commit than any in "abc". + (ds_origin.pathobj / "foo").write_text("foo content") + ds_origin.save(message="foo") + for res in repo.for_each_ref_(fields="refname"): + ref = res["refname"] + if "master" in ref: + repo.update_ref(ref.replace("master", "abc"), ref) + repo.call_git(["update-ref", "-d", ref]) + repo.update_ref("HEAD", + "refs/heads/{}".format( + "adjusted/abc(unlocked)" if managed else "abc"), + symbolic=True) + repo.call_git(["checkout", "-b", "chooseme", "abc~1"]) + if managed: + repo.adjust() + (ds_origin.pathobj / "bar").write_text("bar content") + ds_origin.save(message="bar") + # Try to make the git-annex branch the most recently updated ref so that we + # test that it is skipped. + ds_origin.drop("bar", check=False) + ds_origin.repo.checkout("master", options=["--orphan"]) + + ds = clone(ds_origin.path, op.join(path, "b")) + # We landed on the branch with the most recent commit, ignoring the + # git-annex branch. 
+ branch = ds.repo.get_active_branch() + eq_(ds.repo.get_corresponding_branch(branch) or branch, + "chooseme") + eq_(ds_origin.repo.get_hexsha("chooseme"), + ds.repo.get_hexsha("chooseme")) + # In the context of this test, the clone should be on an adjusted branch if + # the source landed there initially because we're on the same file system. + eq_(managed, ds.repo.is_managed_branch()) + + +@with_tempfile(mkdir=True) +def test_clone_unborn_head_no_other_ref(path): + # TODO: On master, update to use annex=False. + ds_origin = Dataset(op.join(path, "a")).create(no_annex=True) + ds_origin.repo.call_git(["update-ref", "-d", "refs/heads/master"]) + with swallow_logs(new_level=logging.WARNING) as cml: + clone(source=ds_origin.path, path=op.join(path, "b")) + assert_in("could not find a branch with commits", cml.out) + + +@with_tempfile(mkdir=True) +def test_clone_unborn_head_sub(path): + ds_origin = Dataset(op.join(path, "a")).create() + ds_origin_sub = Dataset(op.join(path, "a", "sub")).create() + managed = ds_origin_sub.repo.is_managed_branch() + ds_origin_sub.repo.call_git(["branch", "-m", "master", "other"]) + ds_origin.save() + ds_origin_sub.repo.checkout("master", options=["--orphan"]) + + ds_cloned = clone(source=ds_origin.path, path=op.join(path, "b")) + ds_cloned_sub = ds_cloned.get( + "sub", result_xfm="datasets", return_type="item-or-list") + + branch = ds_cloned_sub.repo.get_active_branch() + eq_(ds_cloned_sub.repo.get_corresponding_branch(branch) or branch, + "other") + # In the context of this test, the clone should be on an adjusted branch if + # the source landed there initially because we're on the same file system. 
+ eq_(managed, ds_cloned_sub.repo.is_managed_branch()) diff -Nru datalad-0.12.4/datalad/core/local/create.py datalad-0.12.6/datalad/core/local/create.py --- datalad-0.12.4/datalad/core/local/create.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/core/local/create.py 2020-04-23 18:42:28.000000000 +0000 @@ -365,7 +365,7 @@ tbrepo.set_default_backend( cfg.obtain('datalad.repo.backend'), persistent=True, commit=False) - add_to_git[tbds.repo.pathobj / '.gitattributes'] = { + add_to_git[tbrepo.pathobj / '.gitattributes'] = { 'type': 'file', 'state': 'added'} # make sure that v6 annex repos never commit content under .datalad @@ -375,7 +375,7 @@ ('metadata/objects/**', 'annex.largefiles', '({})'.format(cfg.obtain( 'datalad.metadata.create-aggregate-annex-limit')))) - attrs = tbds.repo.get_gitattributes( + attrs = tbrepo.get_gitattributes( [op.join('.datalad', i[0]) for i in attrs_cfg]) set_attrs = [] for p, k, v in attrs_cfg: @@ -383,18 +383,18 @@ op.join('.datalad', p), {}).get(k, None) == v: set_attrs.append((p, {k: v})) if set_attrs: - tbds.repo.set_gitattributes( + tbrepo.set_gitattributes( set_attrs, attrfile=op.join('.datalad', '.gitattributes')) # prevent git annex from ever annexing .git* stuff (gh-1597) - attrs = tbds.repo.get_gitattributes('.git') + attrs = tbrepo.get_gitattributes('.git') if not attrs.get('.git', {}).get( 'annex.largefiles', None) == 'nothing': - tbds.repo.set_gitattributes([ + tbrepo.set_gitattributes([ ('**/.git*', {'annex.largefiles': 'nothing'})]) # must use the repo.pathobj as this will have resolved symlinks - add_to_git[tbds.repo.pathobj / '.gitattributes'] = { + add_to_git[tbrepo.pathobj / '.gitattributes'] = { 'type': 'file', 'state': 'untracked'} @@ -404,10 +404,11 @@ # Note, that Dataset property `id` will change when we unset the # respective config. 
Therefore store it before: tbds_id = tbds.id - if id_var in tbds.config: + tbds_config = tbds.config + if id_var in tbds_config: # make sure we reset this variable completely, in case of a # re-create - tbds.config.unset(id_var, where='dataset') + tbds_config.unset(id_var, where='dataset') if _seed is None: # just the standard way @@ -415,7 +416,7 @@ else: # Let's generate preseeded ones uuid_id = str(uuid.UUID(int=random.getrandbits(128))) - tbds.config.add( + tbds_config.add( id_var, tbds_id if tbds_id is not None else uuid_id, where='dataset', @@ -427,21 +428,21 @@ # a dedicated argument, because it is sufficient for the cmdline # and unnecessary for the Python API (there could simply be a # subsequence ds.config.add() call) - for k, v in tbds.config.overrides.items(): - tbds.config.add(k, v, where='local', reload=False) + for k, v in tbds_config.overrides.items(): + tbds_config.add(k, v, where='local', reload=False) # all config manipulation is done -> fll reload - tbds.config.reload() + tbds_config.reload() # must use the repo.pathobj as this will have resolved symlinks - add_to_git[tbds.repo.pathobj / '.datalad'] = { + add_to_git[tbrepo.pathobj / '.datalad'] = { 'type': 'directory', 'state': 'untracked'} # save everything, we need to do this now and cannot merge with the # call below, because we may need to add this subdataset to a parent # but cannot until we have a first commit - tbds.repo.save( + tbrepo.save( message='[DATALAD] new dataset', git=True, # we have to supply our own custom status, as the repo does diff -Nru datalad-0.12.4/datalad/core/local/tests/test_save.py datalad-0.12.6/datalad/core/local/tests/test_save.py --- datalad-0.12.4/datalad/core/local/tests/test_save.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/core/local/tests/test_save.py 2020-04-23 18:42:28.000000000 +0000 @@ -397,8 +397,6 @@ path=op.join(ds.path, 'dir'), gitmodule_url='./dir', gitmodule_name='dir', - # but also the branch, by default - 
gitmodule_branch='master', ) # create another one other = create(other) diff -Nru datalad-0.12.4/datalad/distribution/create_sibling.py datalad-0.12.6/datalad/distribution/create_sibling.py --- datalad-0.12.4/datalad/distribution/create_sibling.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/distribution/create_sibling.py 2020-04-23 18:42:28.000000000 +0000 @@ -98,6 +98,7 @@ # see gh-1188 remoteds_path = normpath(opj(target_dir, ds_name)) + ds_repo = ds.repo # construct a would-be ssh url based on the current dataset's path ssh_url.path = remoteds_path ds_sshurl = ssh_url.as_str() @@ -171,7 +172,7 @@ # if we succeeded in removing it path_exists = False # Since it is gone now, git-annex also should forget about it - remotes = ds.repo.get_remotes() + remotes = ds_repo.get_remotes() if name in remotes: # so we had this remote already, we should announce it dead # XXX what if there was some kind of mismatch and this name @@ -181,9 +182,9 @@ "Announcing existing remote %s dead to annex and removing", name ) - if isinstance(ds.repo, AnnexRepo): - ds.repo.set_remote_dead(name) - ds.repo.remove_remote(name) + if isinstance(ds_repo, AnnexRepo): + ds_repo.set_remote_dead(name) + ds_repo.remove_remote(name) elif existing == 'reconfigure': lgr.info(_msg + " Will only reconfigure") only_reconfigure = True @@ -274,6 +275,20 @@ " and run with --existing=reconfigure", ssh.get_git_version()) + branch = ds_repo.get_active_branch() + if branch is not None: + if hasattr(ds_repo, "get_corresponding_branch"): + # ^ TODO: Drop this when this change hits master, where GitRepo has + # a .get_corresponding_branch method. + branch = ds_repo.get_corresponding_branch(branch) or branch + if branch != "master": + # Setting the HEAD for the created sibling to the original + # repo's current branch should be unsurprising, and it + # helps with consumers that don't properly handle the + # default master with no commits. See gh-4349. 
+ ssh("git -C {} symbolic-ref HEAD refs/heads/{}" + .format(sh_quote(remoteds_path), branch)) + if install_postupdate_hook: # enable metadata refresh on dataset updates to publication server lgr.info("Enabling git post-update hook ...") @@ -489,7 +504,7 @@ # for now assuming hierarchical setup # (TODO: to be able to destinguish between the two, probably # needs storing datalad.*.target_dir to have %RELNAME in there) - sshurl = slash_join(super_url, relpath(ds.path, super_ds.path)) + sshurl = slash_join(super_url, relpath(refds_path, super_ds.path)) # check the login URL sshri = RI(sshurl) @@ -606,7 +621,7 @@ path = _create_dataset_sibling( name, current_ds, - ds.path, + refds_path, ssh, replicate_local_structure, sshri, @@ -636,7 +651,7 @@ remote_repos_to_run_hook_for.append((path, currentds_ap)) # publish web-interface to root dataset on publication server - if current_ds.path == ds.path and ui: + if current_ds.path == refds_path and ui: lgr.info("Uploading web interface to %s" % path) try: CreateSibling.upload_web_interface(path, ssh, shared, ui) diff -Nru datalad-0.12.4/datalad/distribution/dataset.py datalad-0.12.6/datalad/distribution/dataset.py --- datalad-0.12.4/datalad/distribution/dataset.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/distribution/dataset.py 2020-04-23 18:42:28.000000000 +0000 @@ -337,8 +337,8 @@ ------- ConfigManager """ - - if self.repo is None: + repo = self.repo # local binding + if repo is None: # if there's no repo (yet or anymore), we can't read/write config at # dataset level, but only at user/system level # However, if this was the case before as well, we don't want a new @@ -348,7 +348,7 @@ self._cfg_bound = False else: - self._cfg = self.repo.config + self._cfg = repo.config self._cfg_bound = True return self._cfg diff -Nru datalad-0.12.4/datalad/distribution/tests/test_create_sibling.py datalad-0.12.6/datalad/distribution/tests/test_create_sibling.py --- 
datalad-0.12.4/datalad/distribution/tests/test_create_sibling.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/distribution/tests/test_create_sibling.py 2020-04-23 18:42:28.000000000 +0000 @@ -53,6 +53,7 @@ from datalad.utils import on_windows from datalad.utils import _path_ +from datalad.utils import Path import logging lgr = logging.getLogger('datalad.tests') @@ -630,3 +631,42 @@ # Takes too long so one will do with UI and another one without yield _test_target_ssh_inherit, 'manual', True # manual -- no load should be annex copied yield _test_target_ssh_inherit, 'backup', False # backup -- all data files + + +@skip_if_on_windows # create_sibling incompatible with win servers +@skip_ssh +@with_tempfile(mkdir=True) +@with_tempfile(mkdir=True) +def test_non_master_branch(src_path, target_path): + src_path = Path(src_path) + target_path = Path(target_path) + + ds_a = Dataset(src_path).create() + # Rename rather than checking out another branch so that master + # doesn't exist in any state. + ds_a.repo.call_git(["branch", "-m", "master", "other"]) + (ds_a.pathobj / "afile").write_text("content") + sa = ds_a.create("sub-a") + sa.repo.checkout("other-sub", ["-b"]) + ds_a.create("sub-b") + + ds_a.save() + ds_a.create_sibling( + name="sib", recursive=True, + sshurl="ssh://datalad-test" + str(target_path / "b")) + ds_a.publish(to="sib", transfer_data="all") + + ds_b = Dataset(target_path / "b") + + def get_branch(repo): + return repo.get_corresponding_branch() or repo.get_active_branch() + + # The HEAD for the create-sibling matches what the branch was in + # the original repo. 
+ eq_(get_branch(ds_b.repo), "other") + ok_((ds_b.pathobj / "afile").exists()) + + eq_(get_branch(Dataset(target_path / "b" / "sub-a").repo), + "other-sub") + eq_(get_branch(Dataset(target_path / "b" / "sub-b").repo), + "master") diff -Nru datalad-0.12.4/datalad/distribution/tests/test_update.py datalad-0.12.6/datalad/distribution/tests/test_update.py --- datalad-0.12.4/datalad/distribution/tests/test_update.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/distribution/tests/test_update.py 2020-04-23 18:42:28.000000000 +0000 @@ -442,3 +442,29 @@ assert_status('ok', ds.update()) # ATM we do not support multi-way merges assert_status('impossible', ds.update(merge=True, on_failure='ignore')) + + +@with_tempfile(mkdir=True) +def test_update_unborn_master(path): + ds_a = Dataset(op.join(path, "ds-a")).create() + ds_a.repo.call_git(["branch", "-m", "master", "other"]) + ds_a.repo.checkout("master", options=["--orphan"]) + ds_b = install(source=ds_a.path, path=op.join(path, "ds-b")) + + ds_a.repo.checkout("other") + (ds_a.pathobj / "foo").write_text("content") + ds_a.save() + + # clone() will try to switch away from an unborn branch if there + # is another ref available. Reverse these efforts so that we can + # test that update() fails reasonably here because we should still + # be able to update from remotes that datalad didn't clone. 
+ ds_b.repo.update_ref("HEAD", "refs/heads/master", symbolic=True) + assert_false(ds_b.repo.commit_exists("HEAD")) + assert_status("impossible", + ds_b.update(merge=True, on_failure="ignore")) + + ds_b.repo.checkout("other") + assert_status("ok", + ds_b.update(merge=True, on_failure="ignore")) + eq_(ds_a.repo.get_hexsha(), ds_b.repo.get_hexsha()) diff -Nru datalad-0.12.4/datalad/distribution/update.py datalad-0.12.6/datalad/distribution/update.py --- datalad-0.12.4/datalad/distribution/update.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/distribution/update.py 2020-04-23 18:42:28.000000000 +0000 @@ -168,12 +168,18 @@ # NOTE if any further acces to `repo` is needed, reevaluate # ds.repo again, as it might have be converted from an GitRepo # to an AnnexRepo - if merge: - for fr in _update_repo(ds, sibling_, reobtain_data): - yield fr res['status'] = 'ok' + if merge: + if ds.repo.commit_exists("HEAD"): + for fr in _update_repo(ds, sibling_, reobtain_data): + yield fr + save_paths.append(ds.path) + else: + res["status"] = "impossible" + res["message"] = ("No commits on branch '%s'", + repo.get_active_branch()) yield res - save_paths.append(ds.path) + # we need to save updated states only if merge was requested -- otherwise # it was a pure fetch if merge and recursive: diff -Nru datalad-0.12.4/datalad/interface/common_cfg.py datalad-0.12.6/datalad/interface/common_cfg.py --- datalad-0.12.4/datalad/interface/common_cfg.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/interface/common_cfg.py 2020-04-23 18:42:28.000000000 +0000 @@ -307,6 +307,13 @@ 'text': 'Git-annex large files expression (see https://git-annex.branchable.com/tips/largefiles; given expression will be wrapped in parentheses)'}), 'default': 'anything', }, + 'datalad.runtime.max-annex-jobs': { + 'ui': ('question', { + 'title': 'Maximum number of git-annex jobs to request when "jobs" option set to "auto" (default)', + 'text': 'Set this value to enable parallel annex jobs 
that may speed up certain operations (e.g. get file content). The effective number of jobs will not exceed the number of available CPU cores (or 3 if there is less than 3 cores).'}), + 'type': EnsureInt(), + 'default': 1, + }, 'datalad.runtime.raiseonerror': { 'ui': ('question', { 'title': 'Error behavior', diff -Nru datalad-0.12.4/datalad/interface/common_opts.py datalad-0.12.6/datalad/interface/common_opts.py --- datalad-0.12.4/datalad/interface/common_opts.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/interface/common_opts.py 2020-04-23 18:42:28.000000000 +0000 @@ -146,7 +146,9 @@ metavar="NJOBS", default='auto', constraints=EnsureInt() | EnsureNone() | EnsureChoice('auto'), - doc="""how many parallel jobs (where possible) to use.""") + doc="""how many parallel jobs (where possible) to use. "auto" corresponds + to the number defined by 'datalad.runtime.max-annex-jobs' configuration + item""") verbose = Parameter( args=("-v", "--verbose",), diff -Nru datalad-0.12.4/datalad/interface/__init__.py datalad-0.12.6/datalad/interface/__init__.py --- datalad-0.12.4/datalad/interface/__init__.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/interface/__init__.py 2020-04-23 18:42:28.000000000 +0000 @@ -91,6 +91,7 @@ # Some known extensions and their commands to suggest whenever lookup fails _known_extension_commands = { + 'datalad-container': ('containers-list', 'containers-remove', 'containers-add', 'containers-run'), 'datalad-crawler': ('crawl', 'crawl-init'), 'datalad-neuroimaging': ('bids2scidata',) } diff -Nru datalad-0.12.4/datalad/log.py datalad-0.12.6/datalad/log.py --- datalad-0.12.4/datalad/log.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/log.py 2020-04-23 18:42:28.000000000 +0000 @@ -213,6 +213,7 @@ label=getattr(record, 'dlm_progress_label', ''), unit=getattr(record, 'dlm_progress_unit', ''), total=getattr(record, 'dlm_progress_total', None)) + pbar.start() self.pbars[pid] = pbar elif update is None: # 
not an update -> done diff -Nru datalad-0.12.4/datalad/metadata/search.py datalad-0.12.6/datalad/metadata/search.py --- datalad-0.12.4/datalad/metadata/search.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/metadata/search.py 2020-04-23 18:42:28.000000000 +0000 @@ -39,10 +39,12 @@ from datalad.consts import SEARCH_INDEX_DOTGITDIR from datalad.utils import ( - assure_list, assure_iter, unicode_srctypes, as_unicode, + as_unicode, + assure_list, assure_unicode, get_suggestions_msg, - unique, + shortened_repr, + unicode_srctypes, ) from datalad.support.exceptions import NoDatasetFound from datalad.ui import ui @@ -252,7 +254,15 @@ def __call__(self, query, max_nresults=None): raise NotImplementedError - def show_keys(self, *args): + @classmethod + def _key_matches(cls, k, regexes): + """Return which regex the key matches + """ + for regex in regexes: + if re.search(regex, k): + return regex + + def show_keys(self, *args, **kwargs): raise NotImplementedError(args) def get_query(self, query): @@ -280,13 +290,23 @@ self.index_dir = opj(str(self.ds.repo.dot_git), SEARCH_INDEX_DOTGITDIR) self._mk_search_index(force_reindex) - def show_keys(self, mode): + def show_keys(self, mode, regexes=None): + """ + + Parameters + ---------- + mode: {"name"} + regexes: list of regex + Which keys to bother working on + """ if mode != 'name': raise NotImplementedError( "ATM %s can show only names, so please use show_keys with 'name'" % self.__class__.__name__ ) for k in self.idx_obj.schema.names(): + if regexes and not self._key_matches(k, regexes): + continue print(u'{}'.format(k)) def get_query(self, query): @@ -753,45 +773,58 @@ ) break - def show_keys(self, mode=None): - maxl = 100 # maximal line length for unique values in mode=short + def show_keys(self, mode=None, regexes=None): + """ + + Parameters + ---------- + mode: {"name", "short", "full"} + regexes: list of regex + Which keys to bother working on + """ + maxl = 100 # approx maximal line length for unique 
values in mode=short # use a dict already, later we need to map to a definition # meanwhile map to the values keys = self._get_keys(mode) for k in sorted(keys): + if regexes and not self._key_matches(k, regexes): + continue if mode == 'name': print(k) continue # do a bit more stat = keys[k] - uvals = stat.uvals - if mode == 'short': - # show only up to X uvals - if len(stat.uvals) > 10: - uvals = {v for i, v in enumerate(uvals) if i < 10} - # all unicode still scares yoh -- he will just use repr - # def conv(s): - # try: - # return '{}'.format(s) - # except UnicodeEncodeError: - # return assure_unicode(s).encode('utf-8') + all_uvals = uvals = sorted(stat.uvals) + stat.uvals_str = assure_unicode( - "{} unique values: {}".format( - len(stat.uvals), ', '.join(map(repr, uvals)))) + "{} unique values: ".format(len(all_uvals)) + ) + if mode == 'short': - if len(stat.uvals) > 10: - stat.uvals_str += ', ...' - if len(stat.uvals_str) > maxl: - stat.uvals_str = stat.uvals_str[:maxl-4] + ' ....' + # show only up until we fill maxl + uvals_str = '' + uvals = [] + for v in all_uvals: + appendix = ('; ' if uvals else '') + v + if len(uvals_str) + len(appendix) > maxl - len(stat.uvals_str): + break + uvals.append(v) + uvals_str += appendix elif mode == 'full': pass else: raise ValueError( "Unknown value for stats. Know full and short") + stat.uvals_str += '; '.join(uvals) + + if len(all_uvals) > len(uvals): + stat.uvals_str += \ + '; +%s' % single_or_plural("value", "values", len(all_uvals) - len(uvals), True) + print( u'{k}\n in {stat.ndatasets} datasets\n has {stat.uvals_str}'.format( k=k, stat=stat @@ -834,20 +867,20 @@ keys[k].ndatasets += 1 if mode == 'name': continue - try: - kvals_set = assure_iter(kvals, set) - except TypeError: - # TODO: may be do show hashable ones??? 
- nunhashable = sum( - isinstance(x, collections.Hashable) for x in kvals - ) - kvals_set = { - 'unhashable %d out of %d entries' - % (nunhashable, len(kvals)) - } - keys[k].uvals |= kvals_set + keys[k].uvals |= self.get_repr_uvalues(kvals) return keys + def get_repr_uvalues(self, kvals): + kvals_set = set() + if not kvals: + return kvals_set + kvals_iter = ( + kvals + if hasattr(kvals, '__iter__') and not isinstance(kvals, (str, bytes)) + else [kvals] + ) + return set(shortened_repr(x, 50) for x in kvals_iter) + def get_query(self, query): query = assure_list(query) simple_fieldspec = re.compile(r"(?P\S*?):(?P.*)") @@ -871,8 +904,8 @@ self._queried_keys.append(None) # expand matches, compile expressions query = [ - {k: re.compile(self._xfm_query(v)) for k, v in q.groupdict().items()} - if hasattr(q, 'groupdict') else re.compile(self._xfm_query(q)) + {k: self._compile_query(v) for k, v in q.groupdict().items()} + if hasattr(q, 'groupdict') else self._compile_query(q) for q in query_rec_matches ] @@ -888,6 +921,19 @@ # implement potential transformations of regex before they get compiled return q + def _compile_query(self, q): + """xfm and compile the query, with informative exception if query is incorrect + """ + q_xfmed = self._xfm_query(q) + try: + return re.compile(q_xfmed) + except re.error as exc: + omsg = " (original: '%s')" % q if q != q_xfmed else '' + raise ValueError( + "regular expression '%s'%s is incorrect: %s" + % (q_xfmed, omsg, exc) + ) + def get_nohits_msg(self): """Given the query and performed search, provide recommendation @@ -1097,9 +1143,10 @@ Examples: List names of search index fields (auto-discovered from the set of - indexed datasets):: + indexed datasets) which either have a field starting with "age" or + "gender":: - % datalad search --mode autofield --show-keys name + % datalad search --mode autofield --show-keys name '\.age' '\.gender' Fuzzy search for datasets with an author that is specified in a particular metadata field:: @@ 
-1177,6 +1224,8 @@ only the name is printed one per line. If 'short' or 'full', statistics (in how many datasets, and how many unique values) are printed. 'short' truncates the listing of unique values. + QUERY, if provided, is regular expressions any of which keys should + contain. No other action is performed (except for reindexing), even if other arguments are given. Each key is accompanied by a term definition in parenthesis (TODO). In most cases a definition is given in the form @@ -1236,7 +1285,7 @@ searcher = searcher(ds, force_reindex=force_reindex) if show_keys: - searcher.show_keys(show_keys) + searcher.show_keys(show_keys, regexes=query) return if not query: diff -Nru datalad-0.12.4/datalad/metadata/tests/test_search.py datalad-0.12.6/datalad/metadata/tests/test_search.py --- datalad-0.12.4/datalad/metadata/tests/test_search.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/metadata/tests/test_search.py 2020-04-23 18:42:28.000000000 +0000 @@ -12,6 +12,7 @@ import logging from shutil import copy from unittest.mock import patch +import os from os import makedirs from os.path import join as opj from os.path import dirname @@ -22,7 +23,10 @@ swallow_logs, swallow_outputs, ) -from datalad.tests.utils import assert_in +from datalad.tests.utils import ( + assert_in, + assert_re_in, +) from datalad.tests.utils import assert_result_count from datalad.tests.utils import assert_is_generator from datalad.tests.utils import with_tempfile @@ -139,37 +143,6 @@ @with_tempfile -def test_our_metadataset_search(tdir): - # TODO renable when a dataset with new aggregated metadata is - # available at some public location - raise SkipTest - # smoke test for basic search operations on our super-megadataset - # expensive operation but ok - #ds = install( - # path=tdir, - # # TODO renable test when /// metadata actually conforms to the new metadata - # #source="///", - # source="smaug:/mnt/btrfs/datasets-meta6-4/datalad/crawl", - # result_xfm='datasets', 
return_type='item-or-list') - assert list(ds.search('haxby')) - assert_result_count( - ds.search('id:873a6eae-7ae6-11e6-a6c8-002590f97d84', mode='textblob'), - 1, - type='dataset', - path=opj(ds.path, 'crcns', 'pfc-2')) - - # there is a problem with argparse not decoding into utf8 in PY2 - from datalad.cmdline.tests.test_main import run_main - # TODO: make it into an independent lean test - from datalad.cmd import Runner - out, err = Runner(cwd=ds.path)('datalad search Buzsáki') - assert_in('crcns/pfc-2 ', out) # has it in description - # and then another aspect: this entry it among multiple authors, need to - # check if aggregating them into a searchable entity was done correctly - assert_in('crcns/hc-1 ', out) - - -@with_tempfile def test_search_non_dataset(tdir): from datalad.support.gitrepo import GitRepo GitRepo(tdir, create=True) @@ -250,6 +223,33 @@ type """ + # test default behavior while limiting set of keys reported + with swallow_outputs() as cmo: + ds.search(['\.id', 'artist$'], show_keys='short') + out_lines = [l for l in cmo.out.split(os.linesep) if l] + # test that only the ones matching were returned + assert_equal( + [l for l in out_lines if not l.startswith(' ')], + ['audio.music-artist', 'datalad_core.id'] + ) + # more specific test which would also test formatting + assert_equal( + out_lines, + ['audio.music-artist', + ' in 1 datasets', " has 1 unique values: 'dlartist'", + 'datalad_core.id', + ' in 1 datasets', + # we have them sorted + " has 1 unique values: '%s'" % ds.id + ] + ) + + with assert_raises(ValueError) as cme: + ds.search('*wrong') + assert_re_in( + r"regular expression '\(\?i\)\*wrong' \(original: '\*wrong'\) is incorrect: ", + str(cme.exception)) + # check generated autofield index keys with swallow_outputs() as cmo: ds.search(mode='autofield', show_keys='name') diff -Nru datalad-0.12.4/datalad/support/annexrepo.py datalad-0.12.6/datalad/support/annexrepo.py --- datalad-0.12.4/datalad/support/annexrepo.py 2020-03-19 
07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/annexrepo.py 2020-04-23 18:42:28.000000000 +0000 @@ -98,8 +98,9 @@ lgr = logging.getLogger('datalad.annex') -# Limit to # of CPUs and up to 8, but at least 3 to start with -N_AUTO_JOBS = min(8, max(3, cpu_count())) +# TODO Constant is no longer used, but left defined to avoid breakage in +# dependent code. Remove in 0.14 release. +N_AUTO_JOBS = 1 class AnnexRepo(GitRepo, RepoInterface): @@ -300,6 +301,9 @@ if self._ALLOW_LOCAL_URLS: self._allow_local_urls() + # will be evaluated lazily + self._n_auto_jobs = None + def _allow_local_urls(self): """Allow URL schemes and addresses which potentially could be harmful. @@ -2144,7 +2148,14 @@ annex_options += ['--json-progress'] if jobs == 'auto': - jobs = N_AUTO_JOBS + # Limit to # of CPUs (but at least 3 to start with) + # and also an additional config constraint (by default 1 + # due to https://github.com/datalad/datalad/issues/4404) + jobs = self._n_auto_jobs or min( + self.config.obtain('datalad.runtime.max-annex-jobs'), + max(3, cpu_count())) + # cache result to avoid repeated calls to cpu_count() + self._n_auto_jobs = jobs if jobs and jobs != 1: annex_options += ['-J%d' % jobs] if opts: @@ -2418,7 +2429,11 @@ # and that they all have 'file' equal to the passed one out = {} for j, f in zip(json_objects, files): - assert(j.pop('file') == f) + # Starting with version of annex 8.20200330-100-g957a87b43 + # annex started to normalize relative paths. 
+ # ref: https://github.com/datalad/datalad/issues/4431 + # Use normpath around each side to ensure it is the same file + assert normpath(j.pop('file')) == normpath(f) if not j['success']: j = None else: @@ -3346,9 +3361,11 @@ for j in self._run_annex_command_json(cmd, opts=opts, files=files): path = self.pathobj.joinpath(ut.PurePosixPath(j['file'])) rec = info.get(path, None) - if init is not None and rec is None: - # init constraint knows nothing about this path -> skip - continue + if rec is None: + if init is not None: + # init constraint knows nothing about this path -> skip + continue + rec = {} rec.update({'{}{}'.format(key_prefix, k): j[k] for k in j if k != 'file'}) if 'bytesize' in rec: diff -Nru datalad-0.12.4/datalad/support/exceptions.py datalad-0.12.6/datalad/support/exceptions.py --- datalad-0.12.4/datalad/support/exceptions.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/exceptions.py 2020-04-23 18:42:28.000000000 +0000 @@ -163,7 +163,8 @@ """ pattern = \ - re.compile(r'ignored by one of your .gitignore files:\s*(.*)^Use -f.*$', + re.compile(r'ignored by one of your .gitignore files:\s*(.*)' + r'^(?:hint: )?Use -f.*$', flags=re.MULTILINE | re.DOTALL) def __init__(self, cmd="", msg="", code=None, stdout="", stderr="", diff -Nru datalad-0.12.4/datalad/support/github_.py datalad-0.12.6/datalad/support/github_.py --- datalad-0.12.4/datalad/support/github_.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/github_.py 2020-04-23 18:42:28.000000000 +0000 @@ -9,7 +9,10 @@ """Helpers for interaction with GitHub """ from .. import cfg -from ..consts import CONFIG_HUB_TOKEN_FIELD +from ..consts import ( + CONFIG_HUB_TOKEN_FIELD, + GITHUB_LOGIN_URL, +) from ..dochelpers import exc_str from ..downloaders.credentials import UserPassword from ..ui import ui @@ -119,10 +122,6 @@ # We got here so time to try credentials - # make it per user if github_login was provided. 
People might want to use - # different credentials etc - cred_identity = "%s@github" % github_login if github_login else "github" - # if login and passwd were provided - try that one first try_creds = github_login and github_passwd try_login = bool(github_login) @@ -136,7 +135,9 @@ user_name = github_login try_creds = None else: - cred = UserPassword(cred_identity, 'https://github.com/login') + # make it per user if github_login was provided. People might want + # to use different credentials etc + cred = _get_github_cred(github_login) # if github_login was provided, we should first try it as is, # and only ask for password if not cred.is_known: @@ -207,6 +208,12 @@ break +def _get_github_cred(github_login=None): + """Helper to create github credential""" + cred_identity = "%s@github" % github_login if github_login else "github" + return UserPassword(cred_identity, GITHUB_LOGIN_URL) + + def _get_2fa_token(user): one_time_password = ui.question( "2FA one time password", hidden=True, repeat=False @@ -272,12 +279,25 @@ for ses, cred in _gen_github_ses(github_login, github_passwd): if github_organization: try: - yield ses.get_organization(github_organization), cred + org = ses.get_organization(github_organization) + lgr.info( + "Successfully obtained information about organization %s " + "using %s credential", github_organization, cred + ) + yield org, cred except gh.UnknownObjectException as e: # yoh thinks it might be due to insufficient credentials? raise ValueError('unknown organization "{}" [{}]'.format( github_organization, exc_str(e))) + except gh.BadCredentialsException as e: + lgr.warning( + "Having authenticated using %s, we failed (%s) to access " + "information about organization %s. 
We will try next " + "authentication method (if any left available)", + cred or "token", e, github_organization + ) + continue else: yield ses.get_user(), cred @@ -392,4 +412,4 @@ return '{}:github/.../{}'.format(access_protocol, reponame), False else: # report URL for given access protocol - return get_repo_url(repo, access_protocol, github_login), False \ No newline at end of file + return get_repo_url(repo, access_protocol, github_login), False diff -Nru datalad-0.12.4/datalad/support/gitrepo.py datalad-0.12.6/datalad/support/gitrepo.py --- datalad-0.12.4/datalad/support/gitrepo.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/gitrepo.py 2020-04-23 18:42:28.000000000 +0000 @@ -67,7 +67,6 @@ from datalad.consts import ( GIT_SSH_COMMAND, - ADJUSTED_BRANCH_EXPR, ) from datalad.dochelpers import exc_str import datalad.utils as ut @@ -2738,7 +2737,18 @@ if GitRepo.is_valid_repo(self.pathobj / path): subrepo = GitRepo(self.pathobj / path, create=False) subbranch = subrepo.get_active_branch() if subrepo else None - subbranch_hexsha = subrepo.get_hexsha(subbranch) if subrepo else None + try: + subbranch_hexsha = subrepo.get_hexsha(subbranch) if subrepo else None + except ValueError: + if subrepo.commit_exists("HEAD"): + # Not what we thought it was. Reraise. 
+ raise + else: + raise ValueError( + "Cannot add submodule that has an unborn branch " + "checked out: {}" + .format(subrepo.path)) + else: subrepo = None subbranch = None @@ -3829,16 +3839,11 @@ if sm_props.get('type', None) == 'directory'] to_add_submodules = _prune_deeper_repos(to_add_submodules) for cand_sm in to_add_submodules: - branch = self.get_active_branch() - adjusted_match = ADJUSTED_BRANCH_EXPR.match( - branch if branch else '') try: self.add_submodule( str(cand_sm.relative_to(self.pathobj)), url=None, name=None, - branch=adjusted_match.group('name') if adjusted_match - else branch ) except (CommandError, InvalidGitRepositoryError) as e: yield get_status_dict( diff -Nru datalad-0.12.4/datalad/support/json_py.py datalad-0.12.6/datalad/support/json_py.py --- datalad-0.12.4/datalad/support/json_py.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/json_py.py 2020-04-23 18:42:28.000000000 +0000 @@ -148,7 +148,7 @@ yield loads(cont_line) cont_line = u'' if cont_line: # The last line didn't end with a new line. - yield cont_line + yield loads(cont_line) def load_xzstream(fname): diff -Nru datalad-0.12.4/datalad/support/network.py datalad-0.12.6/datalad/support/network.py --- datalad-0.12.4/datalad/support/network.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/network.py 2020-04-23 18:42:28.000000000 +0000 @@ -428,7 +428,8 @@ ---------- ri: str, optional String version of a resource specific for this class. If you would like - a type of the resource be deduced, use RI(ri) + a type of the resource be deduced, use RI(ri). Note that this value + will be passed to str(), so you do not have to cast it yourself. **fields: dict, optional The values for the fields defined in _FIELDS class variable. 
""" @@ -439,11 +440,12 @@ self._fields = self._get_blank_fields() if ri is not None: + ri = str(ri) fields = self._str_to_fields(ri) self._set_from_fields(**fields) # If was initialized from a string representation - if self._str is not None: + if lgr.isEnabledFor(logging.DEBUG) and self._str is not None: # well -- some ris might not unparse identically back # strictly speaking, but let's assume they do ri_ = self.as_str() diff -Nru datalad-0.12.4/datalad/support/tests/test_annexrepo.py datalad-0.12.6/datalad/support/tests/test_annexrepo.py --- datalad-0.12.4/datalad/support/tests/test_annexrepo.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/tests/test_annexrepo.py 2020-04-23 18:42:28.000000000 +0000 @@ -682,7 +682,12 @@ def test_AnnexRepo_always_commit(path): repo = AnnexRepo(path) - runner = Runner(cwd=path) + + def get_annex_commit_counts(): + return len(repo.get_revisions("git-annex")) + + n_annex_commits_initial = get_annex_commit_counts() + file1 = get_most_obscure_supported_name() + "_1" file2 = get_most_obscure_supported_name() + "_2" with open(opj(path, file1), 'w') as f: @@ -701,46 +706,30 @@ # check git log of git-annex branch: # expected: initial creation, update (by annex add) and another # update (by annex log) - out, err = runner.run(['git', 'log', 'git-annex']) - num_commits = len([commit - for commit in out.rstrip(os.linesep).split('\n') - if commit.startswith('commit')]) - eq_(num_commits, 3) - - repo.always_commit = False - repo.add(file2) - - # No additional git commit: - out, err = runner.run(['git', 'log', 'git-annex']) - num_commits = len([commit - for commit in out.rstrip(os.linesep).split('\n') - if commit.startswith('commit')]) - eq_(num_commits, 3) - - repo.always_commit = True - - # Still one commit only in git-annex log, - # but 'git annex log' was called when always_commit was true again, - # so it should commit the addition at the end. Calling it again should then - # show two commits. 
- out, err = repo._run_annex_command('log') - out_list = out.rstrip(os.linesep).splitlines() - eq_(len(out_list), 2, "Output:\n%s" % out_list) - assert_in(file1, out_list[0]) - assert_in("recording state in git", out_list[1]) + eq_(get_annex_commit_counts(), n_annex_commits_initial + 1) + + with patch.object(repo, "always_commit", False): + repo.add(file2) + + # No additional git commit: + eq_(get_annex_commit_counts(), n_annex_commits_initial + 1) + + out, err = repo._run_annex_command('log') + + # And we see only the file before always_commit was set to false: + assert_in(file1, out) + assert_not_in(file2, out) + + # With always_commit back to True, do something that will trigger a commit + # on the annex branches. + repo.sync() out, err = repo._run_annex_command('log') - out_list = out.rstrip(os.linesep).splitlines() - eq_(len(out_list), 2, "Output:\n%s" % out_list) - assert_in(file1, out_list[0]) - assert_in(file2, out_list[1]) + assert_in(file1, out) + assert_in(file2, out) # Now git knows as well: - out, err = runner.run(['git', 'log', 'git-annex']) - num_commits = len([commit - for commit in out.rstrip(os.linesep).split('\n') - if commit.startswith('commit')]) - eq_(num_commits, 4) + eq_(get_annex_commit_counts(), n_annex_commits_initial + 2) # https://github.com/datalad/datalad/pull/3975/checks?check_run_id=369789014#step:8:445 diff -Nru datalad-0.12.4/datalad/support/tests/test_fileinfo.py datalad-0.12.6/datalad/support/tests/test_fileinfo.py --- datalad-0.12.4/datalad/support/tests/test_fileinfo.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/tests/test_fileinfo.py 2020-04-23 18:42:28.000000000 +0000 @@ -226,3 +226,34 @@ eval_availability=True)): assert_in(testfile, ai) assert_equal(ai[testfile]['has_content'], False) + + +@with_tempfile +def test_annexinfo_init(path): + ds = Dataset(path).create() + foo = ds.pathobj / "foo" + foo_cont = b"foo content" + foo.write_bytes(foo_cont) + bar = ds.pathobj / "bar" + bar.write_text(u"bar 
content") + ds.save() + + # Custom init limits report, with original dict getting updated. + cinfo_custom_init = ds.repo.get_content_annexinfo( + init={foo: {"bytesize": 0, + "this-is-surely-only-here": "right?"}}) + assert_not_in(bar, cinfo_custom_init) + assert_in(foo, cinfo_custom_init) + assert_equal(cinfo_custom_init[foo]["bytesize"], len(foo_cont)) + assert_equal(cinfo_custom_init[foo]["this-is-surely-only-here"], + "right?") + + # "git" injects get_content_info() values. + cinfo_init_git = ds.repo.get_content_annexinfo(init="git") + assert_in("gitshasum", cinfo_init_git[foo]) + + # init=None, on the other hand, does not. + cinfo_init_none = ds.repo.get_content_annexinfo(init=None) + assert_in(foo, cinfo_init_none) + assert_in(bar, cinfo_init_none) + assert_not_in("gitshasum", cinfo_init_none[foo]) diff -Nru datalad-0.12.4/datalad/support/tests/test_github_.py datalad-0.12.6/datalad/support/tests/test_github_.py --- datalad-0.12.4/datalad/support/tests/test_github_.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/tests/test_github_.py 2020-04-23 18:42:28.000000000 +0000 @@ -14,12 +14,30 @@ import github as gh from ..exceptions import AccessDeniedError -from ...tests.utils import assert_raises, assert_equal, eq_, assert_in - +from ...tests.utils import ( + assert_equal, + assert_greater, + assert_in, + assert_raises, + eq_, + patch_config, + skip_if, + skip_if_no_network, +) +from ...consts import ( + CONFIG_HUB_TOKEN_FIELD, +) from ...utils import swallow_logs from .. 
import github_ -from ..github_ import get_repo_url +from ..github_ import ( + _gen_github_entity, + _get_github_cred, + get_repo_url, +) + + +skip_if_no_github_cred = skip_if(cond=not _get_github_cred().is_known) def test_get_repo_url(): @@ -109,4 +127,20 @@ mock.patch.object(github_, '_make_github_repo', _make_github_repo): with assert_raises(AccessDeniedError) as cme: github_._make_github_repos(*args) - assert_in("Tried 3 times", str(cme.exception)) \ No newline at end of file + assert_in("Tried 3 times", str(cme.exception)) + + +@skip_if_no_network +@skip_if_no_github_cred +def test__gen_github_entity_organization(): + # to test effectiveness of the fix, we need to provide some + # token which would not work + with patch_config({CONFIG_HUB_TOKEN_FIELD: "ed51111111111111111111111111111111111111"}): + org_cred = next(_gen_github_entity(None, None, 'datalad-collection-1')) + assert len(org_cred) == 2, "we return organization and credential" + org, _ = org_cred + assert org + repos = list(org.get_repos()) + repos_names = [r.name for r in repos] + assert_greater(len(repos), 3) # we have a number of those + assert_in('datasets.datalad.org', repos_names) \ No newline at end of file diff -Nru datalad-0.12.4/datalad/support/tests/test_gitrepo.py datalad-0.12.6/datalad/support/tests/test_gitrepo.py --- datalad-0.12.4/datalad/support/tests/test_gitrepo.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/tests/test_gitrepo.py 2020-04-23 18:42:28.000000000 +0000 @@ -1023,6 +1023,19 @@ yield check_update_submodule_init_adjust_branch, False +@with_tempfile +def test_update_submodules_sub_on_unborn_branch(path): + repo = GitRepo(path, create=True) + repo.commit(msg="c0", options=["--allow-empty"]) + subrepo = GitRepo(op.join(path, "sub"), create=True) + subrepo.commit(msg="s c0", options=["--allow-empty"]) + repo.add_submodule(path="sub") + subrepo.checkout("other", options=["--orphan"]) + with assert_raises(ValueError) as cme: + 
repo.update_submodule(path="sub") + assert_in("unborn branch", str(cme.exception)) + + def test_GitRepo_get_submodules(): raise SkipTest("TODO") diff -Nru datalad-0.12.4/datalad/support/tests/test_json_py.py datalad-0.12.6/datalad/support/tests/test_json_py.py --- datalad-0.12.4/datalad/support/tests/test_json_py.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/tests/test_json_py.py 2020-04-23 18:42:28.000000000 +0000 @@ -47,6 +47,7 @@ result = list(load_stream(fname)) eq_(len(result), 2) eq_(result[0]["key0"], u"a
b") + eq_(result[1]["key1"], u"plain") def test_loads(): diff -Nru datalad-0.12.4/datalad/support/tests/test_network.py datalad-0.12.6/datalad/support/tests/test_network.py --- datalad-0.12.4/datalad/support/tests/test_network.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/support/tests/test_network.py 2020-04-23 18:42:28.000000000 +0000 @@ -161,7 +161,7 @@ eq_(ri, ri_) # just in case ;) above should fail first if smth is wrong if not exact_str: assert_in('Parsed version of', cml.out) - (eq_ if exact_str else neq_)(ri, str(ri_)) # that we can reconstruct it EXACTLY on our examples + (eq_ if exact_str else neq_)(str(ri), str(ri_)) # that we can reconstruct it EXACTLY on our examples # and that we have access to all those fields nok_(set(fields).difference(set(cls._FIELDS))) for f, v in fields.items(): @@ -279,8 +279,8 @@ # and now implicit paths or actually they are also "URI references" _check_ri("f", PathRI, localpath='f', path='f') _check_ri("f/s1", PathRI, localpath='f/s1', path='f/s1') - _check_ri(PurePosixPath("f"), PathRI, localpath='f', path='f', exact_str=False) - _check_ri(PurePosixPath("f/s1"), PathRI, localpath='f/s1', path='f/s1', exact_str=False) + _check_ri(PurePosixPath("f"), PathRI, localpath='f', path='f') + _check_ri(PurePosixPath("f/s1"), PathRI, localpath='f/s1', path='f/s1') # colons are problematic and might cause confusion into SSHRI _check_ri("f/s:1", PathRI, localpath='f/s:1', path='f/s:1') _check_ri("f/s:", PathRI, localpath='f/s:', path='f/s:') diff -Nru datalad-0.12.4/datalad/tests/test_config.py datalad-0.12.6/datalad/tests/test_config.py --- datalad-0.12.4/datalad/tests/test_config.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/tests/test_config.py 2020-04-23 18:42:28.000000000 +0000 @@ -48,6 +48,8 @@ [something] user = name=Jane Doe user = email=jd@example.com +novalue +empty = myint = 3 [onemore "complicated の beast with.dot"] @@ -67,7 +69,7 @@ assert_raises(ValueError, ConfigManager, 
source='dataset') # now read the example config cfg = ConfigManager(Dataset(opj(path, 'ds')), source='dataset') - assert_equal(len(cfg), 3) + assert_equal(len(cfg), 5) assert_in('something.user', cfg) # multi-value assert_equal(len(cfg['something.user']), 2) @@ -80,17 +82,21 @@ assert_true(cfg.has_option('something', 'user')) assert_false(cfg.has_option('something', 'us?er')) assert_false(cfg.has_option('some?thing', 'user')) - assert_equal(sorted(cfg.options('something')), ['myint', 'user']) + assert_equal(sorted(cfg.options('something')), ['empty', 'myint', 'novalue', 'user']) assert_equal(cfg.options(u'onemore.complicated の beast with.dot'), ['findme']) assert_equal( sorted(cfg.items()), [(u'onemore.complicated の beast with.dot.findme', '5.0'), + ('something.empty', ''), ('something.myint', '3'), + ('something.novalue', None), ('something.user', ('name=Jane Doe', 'email=jd@example.com'))]) assert_equal( sorted(cfg.items('something')), - [('something.myint', '3'), + [('something.empty', ''), + ('something.myint', '3'), + ('something.novalue', None), ('something.user', ('name=Jane Doe', 'email=jd@example.com'))]) # always get all values @@ -101,6 +107,12 @@ assert_equal(cfg.getfloat(u'onemore.complicated の beast with.dot', 'findme'), 5.0) assert_equal(cfg.getint('something', 'myint'), 3) assert_equal(cfg.getbool('something', 'myint'), True) + # git demands a key without value at all to be used as a flag, thus True + assert_equal(cfg.getbool('something', 'novalue'), True) + assert_equal(cfg.get('something.novalue'), None) + # empty value is False + assert_equal(cfg.getbool('something', 'empty'), False) + assert_equal(cfg.get('something.empty'), '') assert_equal(cfg.getbool('doesnot', 'exist', default=True), True) assert_raises(TypeError, cfg.getbool, 'something', 'user') diff -Nru datalad-0.12.4/datalad/tests/test_utils.py datalad-0.12.6/datalad/tests/test_utils.py --- datalad-0.12.4/datalad/tests/test_utils.py 2020-03-19 07:27:34.000000000 +0000 +++ 
datalad-0.12.6/datalad/tests/test_utils.py 2020-04-23 18:42:28.000000000 +0000 @@ -430,7 +430,7 @@ assert_equal( repr(buga()), - "buga(a=1, b=<<[0, 1, 2, 3, 4, 5, 6, ...>>, c=)" + "buga(a=1, b=<<[0, 1, 2, 3, 4++372 chars++ 99]>>, c=)" ) assert_equal(buga().some(), "some") diff -Nru datalad-0.12.4/datalad/utils.py datalad-0.12.6/datalad/utils.py --- datalad-0.12.4/datalad/utils.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/utils.py 2020-04-23 18:42:28.000000000 +0000 @@ -191,7 +191,11 @@ if hasattr(value, '__repr__') and (value.__repr__ is not object.__repr__): value_repr = repr(value) if not value_repr.startswith('<') and len(value_repr) > l: - value_repr = "<<%s...>>" % (value_repr[:l - 8]) + value_repr = "<<%s++%d chars++%s>>" % ( + value_repr[:l - 16], + len(value_repr) - (l - 16 + 4), + value_repr[-4:] + ) elif value_repr.startswith('<') and value_repr.endswith('>') and ' object at 0x': raise ValueError("I hate those useless long reprs") else: @@ -1058,6 +1062,73 @@ return newfunc +@optional_args +def collect_method_callstats(func): + """Figure out methods which call the method repeatedly on the same instance + + Use case(s): + - .repo is expensive since does all kinds of checks. + - .config is expensive transitively since it calls .repo each time + + + TODO: + - fancy one could look through the stack for the same id(self) to see if + that location is already in memo. That would hint to the cases where object + is not passed into underlying functions, causing them to redo the same work + over and over again + - ATM might flood with all "1 lines" calls which are not that informative. + The underlying possibly suboptimal use might be coming from their callers. 
+ It might or not relate to the previous TODO + """ + from collections import defaultdict + import traceback + from time import time + memo = defaultdict(lambda: defaultdict(int)) # it will be a dict of lineno: count + # gross timing + times = [] + toppath = op.dirname(__file__) + op.sep + + @wraps(func) + def newfunc(*args, **kwargs): + try: + self = args[0] + stack = traceback.extract_stack() + caller = stack[-2] + stack_sig = \ + "{relpath}:{s.name}".format( + s=caller, relpath=op.relpath(caller.filename, toppath)) + sig = (id(self), stack_sig) + # we will count based on id(self) + wherefrom + memo[sig][caller.lineno] += 1 + t0 = time() + return func(*args, **kwargs) + finally: + times.append(time() - t0) + pass + + def print_stats(): + print("The cost of property {}:".format(func.__name__)) + if not memo: + print("None since no calls") + return + # total count + counts = {k: sum(v.values()) for k,v in memo.items()} + total = sum(counts.values()) + ids = {self_id for (self_id, _) in memo} + print(" Total: {} calls from {} objects with {} contexts taking {:.2f} sec" + .format(total, len(ids), len(memo), sum(times))) + # now we need to sort by value + for (self_id, caller), count in sorted(counts.items(), key=lambda x: x[1], reverse=True): + print(" {} {}: {} from {} lines" + .format(self_id, caller, count, len(memo[(self_id, caller)]))) + + # Upon total exit we print the stats + import atexit + atexit.register(print_stats) + + return newfunc + + # Borrowed from duecredit to wrap duecredit-handling to guarantee failsafe def never_fail(f): """Assure that function never fails -- all exceptions are caught diff -Nru datalad-0.12.4/datalad/version.py datalad-0.12.6/datalad/version.py --- datalad-0.12.4/datalad/version.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/datalad/version.py 2020-04-23 18:42:28.000000000 +0000 @@ -15,7 +15,7 @@ # Hard coded version, to be done by release process, # it is also "parsed" (not imported) by setup.py, that is why assigned 
as # __hardcoded_version__ later and not vise versa -__version__ = '0.12.4' +__version__ = '0.12.6' __hardcoded_version__ = __version__ __full_version__ = __version__ diff -Nru datalad-0.12.4/_datalad_build_support/setup.py datalad-0.12.6/_datalad_build_support/setup.py --- datalad-0.12.4/_datalad_build_support/setup.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/_datalad_build_support/setup.py 2020-04-23 18:42:28.000000000 +0000 @@ -233,7 +233,7 @@ categories[v.get('destination', 'misc')][term] = v for cat in categories: - with open(opj(opath, '{}.rst'.format(cat)), 'w') as rst: + with open(opj(opath, '{}.rst.in'.format(cat)), 'w') as rst: rst.write('.. glossary::\n') for term, v in sorted(categories[cat].items(), key=lambda x: x[0]): rst.write(_indent(term, '\n ')) diff -Nru datalad-0.12.4/debian/changelog datalad-0.12.6/debian/changelog --- datalad-0.12.4/debian/changelog 2020-03-23 17:26:07.000000000 +0000 +++ datalad-0.12.6/debian/changelog 2020-04-23 20:43:05.000000000 +0000 @@ -1,3 +1,12 @@ +datalad (0.12.6-1) unstable; urgency=medium + + * Fresh upstream release: + - should resolve possible issues with an upcoming release of git-annex + * debian/patches + - CPed 0001* was removed, deb_no_utf8 was updated + + -- Yaroslav Halchenko Thu, 23 Apr 2020 16:43:05 -0400 + datalad (0.12.4-2) unstable; urgency=medium * Add python3-distro to Depends of python3-datalad (needed for python>=3.8). 
diff -Nru datalad-0.12.4/debian/patches/0001-BF-match-hint-prefix-for-ignored-files-message-by-gi.patch datalad-0.12.6/debian/patches/0001-BF-match-hint-prefix-for-ignored-files-message-by-gi.patch --- datalad-0.12.4/debian/patches/0001-BF-match-hint-prefix-for-ignored-files-message-by-gi.patch 2020-03-23 17:26:07.000000000 +0000 +++ datalad-0.12.6/debian/patches/0001-BF-match-hint-prefix-for-ignored-files-message-by-gi.patch 1970-01-01 00:00:00.000000000 +0000 @@ -1,72 +0,0 @@ -From: Yaroslav Halchenko -Subject: compatibility with 2.26.0.rc2 git - -Origin: Debian -Forwarded: https://github.com/datalad/datalad/pull/4328 -Last-Update: 2020-03-22 - -From 2e2b1c8dfbb8179547444ef88d10b0849b40bd41 Mon Sep 17 00:00:00 2001 -From: Yaroslav Halchenko -Date: Sun, 22 Mar 2020 13:49:34 -0400 -Subject: [PATCH] BF: match "hint: " prefix for ignored files message by git - -Detected while trying to build package in Debian which has now git -2.26.0.rc2 which lead to the following failure - - ====================================================================== - ERROR: datalad.support.tests.test_gitrepo.test_GitRepo_gitignore - ---------------------------------------------------------------------- - Traceback (most recent call last): - File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest - self.test(*self.arg) - File "/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/tests/utils.py", line 559, in newfunc - return t(*(arg + (d,)), **kw) - File "/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/support/tests/test_gitrepo.py", line 1235, in test_GitRepo_gitignore - gr.add('ignore.me') - File "/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/support/gitrepo.py", line 316, in newfunc - result = func(self, files_new, *args, **kwargs) - File "/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/support/gitrepo.py", line 1161, in add - update=update)) - File 
"/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/support/gitrepo.py", line 1192, in add_ - to_options(update=update) + ['--verbose'] - File "/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/support/gitrepo.py", line 316, in newfunc - result = func(self, files_new, *args, **kwargs) - File "/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/support/gitrepo.py", line 1896, in _git_custom_command - expect_fail=expect_fail) - File "/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/support/gitrepo.py", line 1933, in _run_command_files_split - *args, **kwargs) - File "/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/cmd.py", line 711, in run - cmd, env=self.get_git_environ_adjusted(env), *args, **kwargs) - File "/build/datalad-0.12.4/.pybuild/cpython3_3.7_datalad/build/datalad/cmd.py", line 544, in run - raise CommandError(str(cmd), msg, status, out[0], out[1]) - datalad.support.exceptions.CommandError: CommandError: command '['git', '-c', 'annex.largefiles=nothing', 'add', '--verbose', '--', 'ignore.me']' failed with exitcode 1 - Failed to run ['git', '-c', 'annex.largefiles=nothing', 'add', '--verbose', '--', 'ignore.me'] under '/tmp/datalad_temp_tree_test_GitRepo_gitignorednwh55rq'. Exit code=1. out= err=The following paths are ignored by one of your .gitignore files: - ignore.me - hint: Use -f if you really want to add them. - hint: Turn this message off by running - hint: "git config advice.addIgnoredFile false" - -which showed that new "hint: " prefix. In the solution I decided to stay as -strict (^) and was not sure if groups used for anything, so I had used non-capturing -group (?:) to capture that hint. 
---- - datalad/support/exceptions.py | 3 ++- - 1 file changed, 2 insertions(+), 1 deletion(-) - -diff --git a/datalad/support/exceptions.py b/datalad/support/exceptions.py -index 603ea3f7a..45eda0014 100644 ---- a/datalad/support/exceptions.py -+++ b/datalad/support/exceptions.py -@@ -163,7 +163,8 @@ class GitIgnoreError(CommandError): - """ - - pattern = \ -- re.compile(r'ignored by one of your .gitignore files:\s*(.*)^Use -f.*$', -+ re.compile(r'ignored by one of your .gitignore files:\s*(.*)' -+ r'^(?:hint: )?Use -f.*$', - flags=re.MULTILINE | re.DOTALL) - - def __init__(self, cmd="", msg="", code=None, stdout="", stderr="", --- -2.25.1 - diff -Nru datalad-0.12.4/debian/patches/deb_no_utf8 datalad-0.12.6/debian/patches/deb_no_utf8 --- datalad-0.12.4/debian/patches/deb_no_utf8 2020-03-23 17:26:07.000000000 +0000 +++ datalad-0.12.6/debian/patches/deb_no_utf8 2020-04-23 20:43:05.000000000 +0000 @@ -71,8 +71,8 @@ @skip_if_on_windows # likely would fail --- a/datalad/tests/test_config.py +++ b/datalad/tests/test_config.py -@@ -45,7 +45,7 @@ user = name=Jane Doe - user = email=jd@example.com +@@ -52,7 +52,7 @@ novalue + empty = myint = 3 -[onemore "complicated の beast with.dot"] @@ -80,7 +80,7 @@ findme = 5.0 """ -@@ -71,16 +71,16 @@ def test_something(path, new_home): +@@ -78,16 +78,16 @@ def test_something(path, new_home): assert_true(cfg.has_section('something')) assert_false(cfg.has_section('somethingelse')) assert_equal(sorted(cfg.sections()), @@ -89,7 +89,7 @@ assert_true(cfg.has_option('something', 'user')) assert_false(cfg.has_option('something', 'us?er')) assert_false(cfg.has_option('some?thing', 'user')) - assert_equal(sorted(cfg.options('something')), ['myint', 'user']) + assert_equal(sorted(cfg.options('something')), ['empty', 'myint', 'novalue', 'user']) - assert_equal(cfg.options(u'onemore.complicated の beast with.dot'), ['findme']) + assert_equal(cfg.options(u'onemore.complicated nounicode beast with.dot'), ['findme']) @@ -97,10 +97,10 @@ 
sorted(cfg.items()), - [(u'onemore.complicated の beast with.dot.findme', '5.0'), + [(u'onemore.complicated nounicode beast with.dot.findme', '5.0'), + ('something.empty', ''), ('something.myint', '3'), - ('something.user', ('name=Jane Doe', 'email=jd@example.com'))]) - assert_equal( -@@ -93,7 +93,7 @@ def test_something(path, new_home): + ('something.novalue', None), +@@ -104,7 +104,7 @@ def test_something(path, new_home): cfg.get('something.user'), ('name=Jane Doe', 'email=jd@example.com')) assert_raises(KeyError, cfg.__getitem__, 'somedthing.user') @@ -108,8 +108,8 @@ + assert_equal(cfg.getfloat(u'onemore.complicated nounicode beast with.dot', 'findme'), 5.0) assert_equal(cfg.getint('something', 'myint'), 3) assert_equal(cfg.getbool('something', 'myint'), True) - assert_equal(cfg.getbool('doesnot', 'exist', default=True), True) -@@ -106,8 +106,8 @@ def test_something(path, new_home): + # git demands a key without value at all to be used as a flag, thus True +@@ -123,8 +123,8 @@ def test_something(path, new_home): assert_raises(KeyError, cfg.get_value, 'doesnot', 'exist', default=None) # modification follows @@ -249,7 +249,7 @@ # (ref: https://github.com/datalad/datalad/pull/1921#issuecomment-385809366) --- a/datalad/core/local/tests/test_diff.py +++ b/datalad/core/local/tests/test_diff.py -@@ -475,9 +475,12 @@ def test_diff_rsync_syntax(path): +@@ -472,9 +472,12 @@ def test_diff_rsync_syntax(path): @with_tempfile(mkdir=True) def test_diff_nonexistent_ref_unicode(path): diff -Nru datalad-0.12.4/debian/patches/series datalad-0.12.6/debian/patches/series --- datalad-0.12.4/debian/patches/series 2020-03-23 17:26:07.000000000 +0000 +++ datalad-0.12.6/debian/patches/series 2020-04-23 20:43:05.000000000 +0000 @@ -2,4 +2,3 @@ deb_setup_no_msgpack_and_duecredit deb_no_utf8 python3.patch -0001-BF-match-hint-prefix-for-ignored-files-message-by-gi.patch diff -Nru datalad-0.12.4/docs/source/changelog.rst datalad-0.12.6/docs/source/changelog.rst --- 
datalad-0.12.4/docs/source/changelog.rst 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/docs/source/changelog.rst 2020-04-23 18:42:28.000000000 +0000 @@ -15,6 +15,110 @@ We would recommend to consult log of the `DataLad git repository `__ for more details. +0.12.6 (April 23, 2020) – . +--------------------------- + +Major refactoring and deprecations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- The value of ``datalad.support.annexrep.N_AUTO_JOBS`` is no longer + considered. The variable will be removed in a later release. + (`#4409 `__) + +Fixes +~~~~~ + +- Staring with v0.12.0, ``datalad save`` recorded the current branch of + a parent dataset as the ``branch`` value in the .gitmodules entry for + a subdataset. This behavior is problematic for a few reasons and has + been reverted. + (`#4375 `__) + +- The default for the ``--jobs`` option, “auto”, instructed DataLad to + pass a value to git-annex’s ``--jobs`` equal to + ``min(8, max(3, ))``, which could lead to issues due + to the large number of child processes spawned and file descriptors + opened. To avoid this behavior, ``--jobs=auto`` now results in + git-annex being called with ``--jobs=1`` by default. Configure the + new option ``datalad.runtime.max-annex-jobs`` to control the maximum + value that will be considered when ``--jobs='auto'``. + (`#4409 `__) + +- Various commands have been adjusted to better handle the case where a + remote’s HEAD ref points to an unborn branch. + (`#4370 `__) + +- `search `__ + + - learned to use the query as a regular expression that restricts + the keys that are shown for ``--show-keys short``. + (`#4354 `__) + - gives a more helpful message when query is an invalid regular + expression. + (`#4398 `__) + +- The code for parsing Git configuration did not follow Git’s behavior + of accepting a key with no value as shorthand for key=true. + (`#4421 `__) + +- ``AnnexRepo.info`` needed a compatibility update for a change in how + git-annex reports file names. 
+ (`#4431 `__) + +- `create-sibling-github `__ + did not gracefully handle a token that did not have the necessary + permissions. + (`#4400 `__) + +Enhancements and new features +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- `search `__ + learned to use the query as a regular expression that restricts the + keys that are shown for ``--show-keys short``. + (`#4354 `__) + +- ``datalad `` learned to point to the + `datalad-container `__ + extension when a subcommand from that extension is given but the + extension is not installed. + (`#4400 `__) + (`#4174 `__) + +0.12.5 (Apr 02, 2020) – a small step for datalad … +-------------------------------------------------- + + Fix some bugs and make the world an even better place. + +.. _fixes-1: + +Fixes +~~~~~ + +- Our ``log_progress`` helper mishandled the initial display and step + of the progress bar. + (`#4326 `__) + +- ``AnnexRepo.get_content_annexinfo`` is designed to accept + ``init=None``, but passing that led to an error. + (`#4330 `__) + +- Update a regular expression to handle an output change in Git + v2.26.0. (`#4328 `__) + +- We now set ``LC_MESSAGES`` to ‘C’ while running git to avoid failures + when parsing output that is marked for translation. + (`#4342 `__) + +- The helper for decoding JSON streams loaded the last line of input + without decoding it if the line didn’t end with a new line, a + regression introduced in the 0.12.0 release. + (`#4361 `__) + +- The clone command failed to git-annex-init a fresh clone whenever it + considered to add the origin of the origin as a remote. + (`#4367 `__) + 0.12.4 (Mar 19, 2020) – Windows?! --------------------------------- @@ -22,11 +126,15 @@ associated wheel to enable a working installation on Windows (`#4315 `__). +.. _fixes-2: + Fixes ~~~~~ -- Adjust the behavior of the ``log.outputs`` config switch to make - outputs visible. Its description was adjusted accordingly. 
+- The description of the ``log.outputs`` config switch did not keep up + with code changes and incorrectly stated that the output would be + logged at the DEBUG level; logging actually happens at a lower level. + (`#4317 `__) 0.12.3 (March 16, 2020) – . --------------------------- @@ -34,6 +142,8 @@ Updates for compatibility with the latest git-annex, along with a few miscellaneous fixes +.. _major-refactoring-and-deprecations-1: + Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -44,7 +154,7 @@ should prefer the latter. (`#4285 `__) -.. _fixes-1: +.. _fixes-3: Fixes ~~~~~ @@ -80,6 +190,8 @@ connections but failed to do so. (`#4262 `__) +.. _enhancements-and-new-features-1: + Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -100,7 +212,7 @@ Mostly a bugfix release with various robustifications, but also makes the first step towards versioned dataset installation requests. -.. _major-refactoring-and-deprecations-1: +.. _major-refactoring-and-deprecations-2: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -108,7 +220,7 @@ - The minimum required version for GitPython is now 2.1.12. (`#4070 `__) -.. _fixes-2: +.. _fixes-4: Fixes ~~~~~ @@ -144,7 +256,7 @@ some scenarios. (`#4060 `__) -.. _enhancements-and-new-features-1: +.. _enhancements-and-new-features-2: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -182,7 +294,7 @@ Fix some fallout after major release. -.. _fixes-3: +.. _fixes-5: Fixes ~~~~~ @@ -528,7 +640,7 @@ bet we will fix some bugs and make a world even a better place. -.. _major-refactoring-and-deprecations-2: +.. _major-refactoring-and-deprecations-3: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -588,7 +700,7 @@ - The ``rev_resolve_path`` substituted ``resolve_path`` helper. (`#3797 `__) -.. _fixes-4: +.. _fixes-6: Fixes ~~~~~ @@ -651,7 +763,7 @@ different drive letters. (`#3728 `__) -.. _enhancements-and-new-features-2: +.. 
_enhancements-and-new-features-3: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -724,7 +836,7 @@ Various fixes and enhancements that bring the 0.12.0 release closer. -.. _major-refactoring-and-deprecations-3: +.. _major-refactoring-and-deprecations-4: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -765,7 +877,7 @@ ``unlock`` and ``addurls``, follow the new logic. The goal is for all commands to eventually do so. -.. _fixes-5: +.. _fixes-7: Fixes ~~~~~ @@ -819,7 +931,7 @@ arguments to avoid exceeding the command-line character limit. (`#3587 `__) -.. _enhancements-and-new-features-3: +.. _enhancements-and-new-features-4: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -889,7 +1001,7 @@ - The ``add`` command is now deprecated. It will be removed in a future release. -.. _fixes-6: +.. _fixes-8: Fixes ~~~~~ @@ -906,7 +1018,7 @@ exists yet (`#3403 `__) -.. _enhancements-and-new-features-4: +.. _enhancements-and-new-features-5: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -922,7 +1034,7 @@  Continues API consolidation and replaces the ``create`` and ``diff`` command with more performant implementations. -.. _major-refactoring-and-deprecations-4: +.. _major-refactoring-and-deprecations-5: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -955,7 +1067,7 @@ - ``AnnexRepo.get_status`` has been replaced by ``AnnexRepo.status``. (`#3330 `__) -.. _fixes-7: +.. _fixes-9: Fixes ~~~~~ @@ -984,7 +1096,7 @@ - The new pathlib-based code had various encoding issues on Python 2. (`#3332 `__) -.. _enhancements-and-new-features-5: +.. _enhancements-and-new-features-6: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1063,7 +1175,7 @@ 0.12.0rc2 (Mar 18, 2019) – revolution! -------------------------------------- -.. _fixes-8: +.. _fixes-10: Fixes ~~~~~ @@ -1073,7 +1185,7 @@ - ``GitRepo.save()`` reports results on deleted files. -.. _enhancements-and-new-features-6: +.. 
_enhancements-and-new-features-7: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1091,7 +1203,7 @@ 0.12.0rc1 (Mar 03, 2019) – to boldly go … ----------------------------------------- -.. _major-refactoring-and-deprecations-5: +.. _major-refactoring-and-deprecations-6: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1099,7 +1211,7 @@ - Discontinued support for git-annex direct-mode (also no longer supported upstream). -.. _enhancements-and-new-features-7: +.. _enhancements-and-new-features-8: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1113,7 +1225,7 @@ 0.11.8 (Oct 11, 2019) – annex-we-are-catching-up ------------------------------------------------ -.. _fixes-9: +.. _fixes-11: Fixes ~~~~~ @@ -1125,7 +1237,7 @@ (`#3769 `__) (`#3770 `__) -.. _enhancements-and-new-features-8: +.. _enhancements-and-new-features-9: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1152,7 +1264,7 @@ Primarily bugfixes with some optimizations and refactorings. -.. _fixes-10: +.. _fixes-12: Fixes ~~~~~ @@ -1196,7 +1308,7 @@ now will create leading directories of the output path if they do not exist (`#3646 `__) -.. _enhancements-and-new-features-9: +.. _enhancements-and-new-features-10: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1232,7 +1344,7 @@ Primarily bug fixes to achieve more robust performance -.. _fixes-11: +.. _fixes-13: Fixes ~~~~~ @@ -1265,7 +1377,7 @@ the remote not being enabled. (`#3547 `__) -.. _enhancements-and-new-features-10: +.. _enhancements-and-new-features-11: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1294,7 +1406,7 @@ Should be faster and less buggy, with a few enhancements. -.. _fixes-12: +.. _fixes-14: Fixes ~~~~~ @@ -1334,7 +1446,7 @@ - The detection of SSH RIs has been improved. (`#3425 `__) -.. _enhancements-and-new-features-11: +.. 
_enhancements-and-new-features-12: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1397,7 +1509,7 @@ crippled (no symlinks and no locking) filesystems. v7 repositories should be used instead. -.. _fixes-13: +.. _fixes-15: Fixes ~~~~~ @@ -1447,7 +1559,7 @@ ``.isatty``. (`#3268 `__) -.. _enhancements-and-new-features-12: +.. _enhancements-and-new-features-13: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1490,7 +1602,7 @@ Just a few of important fixes and minor enhancements. -.. _fixes-14: +.. _fixes-16: Fixes ~~~~~ @@ -1508,7 +1620,7 @@ to avoid these failures. (`#3164 `__) -.. _enhancements-and-new-features-13: +.. _enhancements-and-new-features-14: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1530,7 +1642,7 @@ A variety of bugfixes and enhancements -.. _major-refactoring-and-deprecations-6: +.. _major-refactoring-and-deprecations-7: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1541,7 +1653,7 @@ - The function ``datalad.cmd.get_runner`` has been removed. (`#3104 `__) -.. _fixes-15: +.. _fixes-17: Fixes ~~~~~ @@ -1603,7 +1715,7 @@ - Pass ``GIT_SSH_VARIANT=ssh`` to git processes to be able to specify alternative ports in SSH urls -.. _enhancements-and-new-features-14: +.. _enhancements-and-new-features-15: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1667,7 +1779,7 @@ `git-annex `__ which introduced v7 to replace v6. -.. _fixes-16: +.. _fixes-18: Fixes ~~~~~ @@ -1715,7 +1827,7 @@ (`#2960 `__) (`#2950 `__) -.. _enhancements-and-new-features-15: +.. _enhancements-and-new-features-16: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1752,7 +1864,7 @@ `git-annex `__ 6.20180913 (or later) is now required - provides a number of fixes for v6 mode operations etc. -.. _major-refactoring-and-deprecations-7: +.. 
_major-refactoring-and-deprecations-8: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1779,7 +1891,7 @@ instead of ``treeishes`` (`#2903 `__) -.. _fixes-17: +.. _fixes-19: Fixes ~~~~~ @@ -1828,7 +1940,7 @@ paths when called more than once (`#2921 `__) -.. _enhancements-and-new-features-16: +.. _enhancements-and-new-features-17: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1880,7 +1992,7 @@ sure that you are using a recent ``git-annex`` since it also had a variety of fixes and enhancements in the past months. -.. _fixes-18: +.. _fixes-20: Fixes ~~~~~ @@ -1943,7 +2055,7 @@ error message now. (`#2815 `__) -.. _enhancements-and-new-features-17: +.. _enhancements-and-new-features-18: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2011,7 +2123,7 @@ forbidding file:// and http://localhost/ URLs which might lead to revealing private files if annex is publicly shared. -.. _fixes-19: +.. _fixes-21: Fixes ~~~~~ @@ -2021,7 +2133,7 @@ will now download to current directory instead of the top of the dataset -.. _enhancements-and-new-features-18: +.. _enhancements-and-new-features-19: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2042,7 +2154,7 @@ The is a minor bugfix release. -.. _fixes-20: +.. _fixes-22: Fixes ~~~~~ @@ -2057,7 +2169,7 @@ This release is a major leap forward in metadata support. -.. _major-refactoring-and-deprecations-8: +.. _major-refactoring-and-deprecations-9: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2088,7 +2200,7 @@ - By default a dataset X is now only considered to be a super-dataset of another dataset Y, if Y is also a registered subdataset of X. -.. _fixes-21: +.. _fixes-23: Fixes ~~~~~ @@ -2113,7 +2225,7 @@ - More robust URL handling in ``simple_with_archives`` crawler pipeline. -.. _enhancements-and-new-features-19: +.. 
_enhancements-and-new-features-20: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2195,7 +2307,7 @@ Some important bug fixes which should improve usability -.. _fixes-22: +.. _fixes-24: Fixes ~~~~~ @@ -2211,7 +2323,7 @@ “git mv”ed, so you can now ``datalad run git mv old new`` and have changes recorded -.. _enhancements-and-new-features-20: +.. _enhancements-and-new-features-21: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2229,7 +2341,7 @@ Largely a bugfix release with a few enhancements. -.. _fixes-23: +.. _fixes-25: Fixes ~~~~~ @@ -2256,7 +2368,7 @@ - Assure that extracted from tarballs directories have executable bit set -.. _enhancements-and-new-features-21: +.. _enhancements-and-new-features-22: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2295,7 +2407,7 @@ Minor bugfix release -.. _fixes-24: +.. _fixes-26: Fixes ~~~~~ @@ -2308,7 +2420,7 @@ 0.9.0 (Sep 19, 2017) – isn’t it a lucky day even though not a Friday? --------------------------------------------------------------------- -.. _major-refactoring-and-deprecations-9: +.. _major-refactoring-and-deprecations-10: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2328,7 +2440,7 @@ `publish `__ now transfers data before repository content is pushed. -.. _fixes-25: +.. _fixes-27: Fixes ~~~~~ @@ -2359,7 +2471,7 @@ - crawl templates should not now override settings for ``largefiles`` if specified in ``.gitattributes`` -.. _enhancements-and-new-features-22: +.. _enhancements-and-new-features-23: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2400,7 +2512,7 @@ Bugfixes -.. _fixes-26: +.. _fixes-28: Fixes ~~~~~ @@ -2417,7 +2529,7 @@ - More robust handling of unicode output in terminals which might not support it -.. _enhancements-and-new-features-23: +.. _enhancements-and-new-features-24: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2437,7 +2549,7 @@ A variety of fixes and enhancements -.. _fixes-27: +.. 
_fixes-29: Fixes ~~~~~ @@ -2452,7 +2564,7 @@ should better tollerate publishing to pure git and ``git-annex`` special remotes -.. _enhancements-and-new-features-24: +.. _enhancements-and-new-features-25: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2477,7 +2589,7 @@ New features, refactorings, and bug fixes. -.. _major-refactoring-and-deprecations-10: +.. _major-refactoring-and-deprecations-11: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2492,7 +2604,7 @@ have been re-written to support the same common API as most other commands -.. _enhancements-and-new-features-25: +.. _enhancements-and-new-features-26: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2512,7 +2624,7 @@ - Significant parts of the documentation of been updated - Instantiate GitPython’s Repo instances lazily -.. _fixes-28: +.. _fixes-30: Fixes ~~~~~ @@ -2539,7 +2651,7 @@ - input paths/arguments analysis was redone for majority of the commands to provide unified behavior -.. _major-refactoring-and-deprecations-11: +.. _major-refactoring-and-deprecations-12: Major refactoring and deprecations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2550,7 +2662,7 @@ - ‘datalad.api.alwaysrender’ config setting/support is removed in favor of new outputs processing -.. _fixes-29: +.. _fixes-31: Fixes ~~~~~ @@ -2565,7 +2677,7 @@ closed `__ for more information -.. _enhancements-and-new-features-26: +.. _enhancements-and-new-features-27: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2617,7 +2729,7 @@ A bugfix release -.. _fixes-30: +.. _fixes-32: Fixes ~~~~~ @@ -2638,7 +2750,7 @@ speeds - should provide progress reports while using Python 3.x -.. _enhancements-and-new-features-27: +.. _enhancements-and-new-features-28: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2706,7 +2818,7 @@ `create-sibling `__ ``--inherit`` -.. _fixes-31: +.. 
_fixes-33: Fixes ~~~~~ @@ -2720,7 +2832,7 @@ operation outside of the datasets - A number of fixes for direct and v6 mode of annex -.. _enhancements-and-new-features-28: +.. _enhancements-and-new-features-29: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2755,7 +2867,7 @@ Requires now GitPython >= 2.1.0 -.. _fixes-32: +.. _fixes-34: Fixes ~~~~~ @@ -2770,7 +2882,7 @@ - do not log calls to ``git config`` to avoid leakage of possibly sensitive settings to the logs -.. _enhancements-and-new-features-29: +.. _enhancements-and-new-features-30: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2806,7 +2918,7 @@ `get `__ implementation, it gets a new minor release. -.. _fixes-33: +.. _fixes-35: Fixes ~~~~~ @@ -2820,7 +2932,7 @@ - robust detection of outdated `git-annex `__ -.. _enhancements-and-new-features-30: +.. _enhancements-and-new-features-31: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2849,7 +2961,7 @@ Primarily bugfixes but also a number of enhancements and core refactorings -.. _fixes-34: +.. _fixes-36: Fixes ~~~~~ @@ -2859,7 +2971,7 @@ - `install `__ can be called on already installed dataset (with ``-r`` or ``-g``) -.. _enhancements-and-new-features-31: +.. _enhancements-and-new-features-32: Enhancements and new features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff -Nru datalad-0.12.4/docs/source/config.rst datalad-0.12.6/docs/source/config.rst --- datalad-0.12.4/docs/source/config.rst 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/docs/source/config.rst 2020-04-23 18:42:28.000000000 +0000 @@ -33,19 +33,19 @@ Global user configuration ========================= -.. include:: generated/cfginfo/global.rst +.. include:: generated/cfginfo/global.rst.in Local repository configuration ============================== -.. include:: generated/cfginfo/local.rst +.. include:: generated/cfginfo/local.rst.in Sticky dataset configuration ============================= -.. include:: generated/cfginfo/dataset.rst +.. 
include:: generated/cfginfo/dataset.rst.in Miscellaneous configuration =========================== -.. include:: generated/cfginfo/misc.rst +.. include:: generated/cfginfo/misc.rst.in diff -Nru datalad-0.12.4/docs/source/conf.py datalad-0.12.6/docs/source/conf.py --- datalad-0.12.4/docs/source/conf.py 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/docs/source/conf.py 2020-04-23 18:42:28.000000000 +0000 @@ -24,7 +24,13 @@ sys.path.insert(0, os.path.abspath('utils')) # travis sys.path.insert(0, os.path.abspath(opj(pardir, 'utils'))) # RTD from pygments_ansi_color import AnsiColorLexer - sphinx.add_lexer("ansi-color", AnsiColorLexer()) + # As of Sphinx v2.1, passing an instance is deprecated. + # TODO: Remove when minimum sphinx version is at least 2.1. + import sphinx as sphinx_mod + sphinx_ver = int(sphinx_mod.__version__.split('.')[0]) + if sphinx_ver < 3: # Check against 3 rather than 2.1 for simplicity. + AnsiColorLexer = AnsiColorLexer() + sphinx.add_lexer("ansi-color", AnsiColorLexer) # If extensions (or modules to document with autodoc) are in another directory, diff -Nru datalad-0.12.4/.gitmodules datalad-0.12.6/.gitmodules --- datalad-0.12.4/.gitmodules 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/.gitmodules 2020-04-23 18:42:28.000000000 +0000 @@ -1,3 +0,0 @@ -[submodule ".asv"] - path = .asv - url = https://github.com/datalad/.asv.git diff -Nru datalad-0.12.4/Makefile datalad-0.12.6/Makefile --- datalad-0.12.4/Makefile 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/Makefile 2020-04-23 18:42:28.000000000 +0000 @@ -39,11 +39,13 @@ tools/link_issues_CHANGELOG update-changelog: linkissues-changelog + # test if the changelog still contains a release placeholder + git grep -q '^##.*??? ??' -- CHANGELOG.md && exit 1 || true @echo ".. 
This file is auto-converted from CHANGELOG.md (make update-changelog) -- do not edit\n\nChange log\n**********" > docs/source/changelog.rst pandoc -t rst CHANGELOG.md >> docs/source/changelog.rst release-pypi: update-changelog - # better safe than sorry + # avoid upload of stale builds test ! -e dist $(PYTHON) setup.py sdist # the wheels we would produce are broken on windows, because they diff -Nru datalad-0.12.4/.travis.yml datalad-0.12.6/.travis.yml --- datalad-0.12.4/.travis.yml 2020-03-19 07:27:34.000000000 +0000 +++ datalad-0.12.6/.travis.yml 2020-04-23 18:42:28.000000000 +0000 @@ -40,6 +40,9 @@ # We cannot have empty -A selector, so the one which always will be fulfilled - NOSE_SELECTION= - NOSE_SELECTION_OP=not + # To test https://github.com/datalad/datalad/pull/4342 fix. + # From our testing in that PR seems to have no effect, but kept around since should not hurt. + - LC_ALL=ru_RU.UTF-8 - python: 3.6 # Single run for Python 3.6 env: @@ -56,6 +59,9 @@ env: - DATALAD_REPO_VERSION=6 - NOSE_SELECTION_OP="" + # To test https://github.com/datalad/datalad/pull/4342 fix in case of no "not" for NOSE. + # From our testing in that PR seems to have no effect, but kept around since should not hurt. + - LANG=bg_BG.UTF-8 - python: 3.5 # Run slow etc tests under a single tricky scenario env:
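Note on the .travis.yml hunks above: the added `LC_ALL` and `LANG` settings exercise the #4342 fix, which the changelog describes as setting `LC_MESSAGES` to 'C' while running git. The general idea can be sketched as below. The `git_environ` helper is hypothetical and is not DataLad's implementation; it only shows how a child-process environment with a pinned message locale might be built.

```python
import os


def git_environ(base=None):
    """Hypothetical helper: copy an environment and pin LC_MESSAGES to 'C'.

    `base` defaults to the current process environment. The input mapping
    is never modified; a new dict is returned.
    """
    env = dict(os.environ if base is None else base)
    # Ask for untranslated messages so that output parsing sees English text.
    env["LC_MESSAGES"] = "C"
    return env


# Example: a job environment similar to the second Travis entry.
job_env = {"LANG": "bg_BG.UTF-8", "PATH": "/usr/bin"}
print(git_environ(job_env))
```

The returned mapping could then be passed as `env=` to `subprocess.run(["git", ...])`. Under POSIX locale resolution a set `LC_ALL` takes precedence over `LC_MESSAGES`, so this sketch by itself would not neutralize the `LC_ALL=ru_RU.UTF-8` job; how DataLad treats that case is not visible in this diff.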