Pulp yum importer: NoneType object has no attribute 'findall'

Problem:
Syncing yum repos to the proxies always fails to complete if it can’t process the comps.xml properly.
It is related to the problem:
CentOS 8 BaseOS thinks it is AppStream
Because the AppStream’s repo data ends up in the wrong place, the replication doesn’t find it and pulp fails.

Expected outcome:
Obviously the real solution is to handle the variants properly.
But the pulp yum importer also needs to handle unexpected cases (like packagelist being None) rather than “crashing”, because when it crashes it doesn’t complete the remaining sync steps that were meant to run afterwards.

Foreman and Proxy versions:
foreman 2.3.2
katello 3.18.1

Foreman and Proxy plugin versions:
pulp-rpm-plugins-2.21.5-1.el7.noarch

Distribution and version:
centos 7

Other relevant data:

"progress_report"=>
 {"yum_importer"=>
   {"content"=>
     {"size_total"=>0,
      "items_left"=>0,
      "items_total"=>0,
      "state"=>"FAILED",
      "size_left"=>0,
      "details"=>
       {"rpm_total"=>0, "rpm_done"=>0, "drpm_total"=>0, "drpm_done"=>0},
      "error"=>"'NoneType' object has no attribute 'findall'",
      "error_details"=>[]},
    "comps"=>{"state"=>"NOT_STARTED"},
    "purge_duplicates"=>{"state"=>"NOT_STARTED"},
    "distribution"=>
     {"items_total"=>0,
      "state"=>"NOT_STARTED",
      "error_details"=>[],
      "items_left"=>0},
    "modules"=>{"state"=>"NOT_STARTED"},
    "errata"=>{"state"=>"NOT_STARTED"},
    "metadata"=>{"state"=>"FINISHED"}}},
"queue"=>"reserved_resource_worker-1@m21vmfrm.dq2",
"state"=>"error",
"worker_name"=>"reserved_resource_worker-1@m21vmfrm",
"result"=>nil,
"error"=>
 {"code"=>"PLP0000",
  "data"=>{},
  "description"=>"Importer indicated a failed response",
  "sub_errors"=>[]},
"_id"=>{"$oid"=>"6037f739555c2f6581c89f8f"},
"id"=>"6037f739555c2f6581c89f8f"}],

"poll_attempts"=>{"total"=>32, "failed"=>1}}

The issue is in:
/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/repomd/group.py

def process_group_element(repo_id, element):
    """
    Process one XML block from comps.xml and return a models.PackageGroup instance

    :param repo_id: unique ID for the destination repository
    :type  repo_id  basestring
    :param element: object representing one "group" block from the XML file
    :type  element: xml.etree.ElementTree.Element

    :return:    models.PackageGroup instance for the XML block
    :rtype:     pulp_rpm.plugins.db.models.PackageGroup
    """
    packagelist = element.find('packagelist')
    conditional, default, mandatory, optional = _parse_packagelist(
        packagelist.findall('packagereq'))
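
For illustration, a guard along these lines would avoid the crash when a group has no packagelist. This is only a sketch: `packagereqs_or_empty` is a hypothetical helper of mine, not something that exists in pulp_rpm, and I'm assuming a missing packagelist should simply be treated as a group with no packages.

```python
import xml.etree.ElementTree as ET


def packagereqs_or_empty(element):
    """Return the <packagereq> children of a group, or [] if <packagelist> is missing."""
    packagelist = element.find('packagelist')
    if packagelist is None:
        # Assumption: a group without a <packagelist> has no packages.
        return []
    return packagelist.findall('packagereq')


# Two sample <group> blocks: one without a packagelist, one with a single package.
group_without_list = ET.fromstring('<group><id>empty</id></group>')
group_with_list = ET.fromstring(
    '<group><id>core</id><packagelist>'
    '<packagereq type="mandatory">bash</packagereq>'
    '</packagelist></group>')

print(len(packagereqs_or_empty(group_without_list)))
print([req.text for req in packagereqs_or_empty(group_with_list)])
```

The result could then be passed to `_parse_packagelist` in place of the direct `packagelist.findall('packagereq')` call.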

Hi, I have the same problem with CentOS 7. Is there any fix for this? I’m unable to sync to my proxy servers.

Hey @sinewave,

are you on the same versions?

Hi, I had exactly the same issue.
It’s documented here: Issue #8713: Pulp 3 to Pulp 2 sync fails if comps.xml has a group with an empty packagelist - RPM Support - Pulp
The solution is also provided there.

This happens because you’re serving Pulp 3 content to Pulp 2 smart proxies. As the linked issue describes, Pulp 3 can publish a comps.xml containing groups with an empty packagelist, and the Pulp 2 importer chokes on them.

Basically, just back up /usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/publishing.py and make the correction described in the Pulp issue:

diff --git a/pulp_rpm/app/tasks/publishing.py b/pulp_rpm/app/tasks/publishing.py
index 1604a48..3528808 100644
--- a/pulp_rpm/app/tasks/publishing.py
+++ b/pulp_rpm/app/tasks/publishing.py
@@ -501,7 +501,7 @@ def create_repomd_xml(
 
     comps.toxml_f(
         comps_xml_path,
-        xml_options={"default_explicit": True, "empty_groups": True, "uservisible_explicit": True},
+        xml_options={"default_explicit": True, "empty_groups": False, "uservisible_explicit": True},
     )
 
     pri_xml.close()

Then I republished all my content views and regenerated the metadata (I’m not sure whether both were necessary).
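
If you want to check whether a published comps.xml still contains the problematic groups, something like this could help. It is a rough sketch, not part of Pulp or Katello: `groups_without_packages` is a name I made up, and the sample XML is invented for the demonstration.

```python
import xml.etree.ElementTree as ET


def groups_without_packages(comps_xml):
    """Return the ids of <group> elements whose <packagelist> is missing or empty."""
    root = ET.fromstring(comps_xml)
    bad = []
    for group in root.iter('group'):
        packagelist = group.find('packagelist')
        # Flag groups with no <packagelist> at all, or one without any <packagereq>.
        if packagelist is None or packagelist.find('packagereq') is None:
            bad.append(group.findtext('id'))
    return bad


# Invented sample: one healthy group and two that the check should flag.
sample = """<comps>
  <group><id>core</id><packagelist>
    <packagereq type="mandatory">bash</packagereq>
  </packagelist></group>
  <group><id>empty-list</id><packagelist/></group>
  <group><id>no-list</id></group>
</comps>"""

print(groups_without_packages(sample))
```

For a real repository, pass in the contents of the uncompressed comps.xml from the repo's repodata directory. An empty result would suggest the republish picked up the `empty_groups` change.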

Thanks, but it’s not working for me.
The versions I’m using are very similar to @simonmcq’s:

  • foreman-2.3.5-1.el7.noarch
  • katello-3.18.3-1.el7.noarch
  • pulp-server-2.21.5-1.el7.noarch

Interestingly, the Backend System Status page lists Pulp 3 as OK. I never noticed this before, and I’m quite sure I only installed Pulp 2. Is there something going on under the covers?

I’ve made the change referenced above, then republished and regenerated the metadata, though I’m not sure I did it the same way.

I edited publishing.py (backup kept as publishing.py.bk):
xml_options={"default_explicit": True, "empty_groups": True, "uservisible_explicit": True},

I did the republish using hammer, as “republish” doesn’t seem to be available in the GUI.
I did the metadata regeneration in the Foreman GUI.

I synced a content view and it was successful. I was so delighted.
Unfortunately, when I added a real lifecycle environment,
I was still getting errors:

I get this one:
Katello::Errors::PulpError: PLP0000: Importer indicated a failed response
"Traceback (most recent call last):\n" +
" File "/usr/lib/python2.7/site-packages/pulp/server/controllers/repository.py", line 860, in sync\n" +
" raise pulp_exceptions.PulpExecutionException(_('Importer indicated a failed response'))\n" +
"PulpExecutionException: Importer indicated a failed response\n",

{"rpm_total"=>0, "rpm_done"=>0, "drpm_total"=>0, "drpm_done"=>0},
"error"=>"'NoneType' object has no attribute 'findall'",
"error_details"=>[]},
"comps"=>{"state"=>"NOT_STARTED"},
"purge_duplicates"=>{"state"=>"NOT_STARTED"},

I’m also getting a metadata error:
raise PulpCodedException(error_code=error_codes.RPM1004, reason=reason)\n" +
"PulpCodedException: Error retrieving metadata: Error 'Proxy Error' for https:///pulp/repos/*/Spacewalk/Foreman2_3_Katello_EL6/custom/Foreman2_3_Katello_EL6/Foreman2_3_Katello_EL6/repodata/c0cace4d6bd977ad29096c46467682b30b7ef65aff60fe7f7238096f9003d898-updateinfo.xml.gz.\n",

"reason"=>
"Error 'Proxy Error' for https://****/pulp/repos/*****/Spacewalk/Foreman2_3_Katello_EL6/custom/Foreman2_3_Katello_EL6/Foreman2_3_Katello_EL6/repodata/c0cace4d6bd977ad29096c46467682b30b7ef65aff60fe7f7238096f9003d898-updateinfo.xml.gz."},
"description"=>
"Error retrieving metadata: Error '

As far as I can make out, there is nothing wrong with that XML file. The server has access to it and can download it with wget. I’ve regenerated the metadata many times.

Any ideas?

Sorry, I just realised I pasted the wrong text for publishing.py: I posted my backup copy. The edited file has:
    comps.toxml_f(
        comps_xml_path,
        xml_options={"default_explicit": True, "empty_groups": False, "uservisible_explicit": True},
    )