Why does an OpenShift Data Foundation upgrade fail with the message `failed to find valid VolumeSource for PVC <pvc-name>`?

Solution Verified - Updated -

Issue

  • During a major OpenShift Data Foundation (ODF) upgrade, some OSDs are not upgraded. The following error appears in the logs of the rook-ceph-operator pod:

    failed to update OSD 6: failed to generate config for OSD X on PVC "<pvc-name>": failed to find valid VolumeSource for PVC "<pvc-name>"
    
  • The output of the `ceph versions` command shows OSDs running different Ceph versions. For instructions on how to access the Ceph command line in ODF, refer to KCS article 4870821 - Accessing the Red Hat Ceph Storage CLI in OpenShift Data Foundation 4.x:

    NAMESPACE=openshift-storage;ROOK_POD=$(oc -n ${NAMESPACE} get pod -l app=rook-ceph-operator -o jsonpath='{.items[0].metadata.name}');oc exec -it ${ROOK_POD} -n ${NAMESPACE} -- ceph versions  --cluster=${NAMESPACE} --conf=/var/lib/rook/${NAMESPACE}/${NAMESPACE}.config --keyring=/var/lib/rook/${NAMESPACE}/client.admin.keyring
    
    {
        "mon": {
            "ceph version 18.2.1-278.el9cp (2ae16095654f99a1a043ca3f0c7befcb78080058) reef (stable)": 3
        },
        "mgr": {
            "ceph version 18.2.1-278.el9cp (2ae16095654f99a1a043ca3f0c7befcb78080058) reef (stable)": 2
        },
        "osd": {
            "ceph version 18.2.1-229.el9cp (ef652b206f2487adfc86613646a4cac946f6b4e0) reef (stable)": 8, ---->
            "ceph version 18.2.1-278.el9cp (2ae16095654f99a1a043ca3f0c7befcb78080058) reef (stable)": 4  ---->
        },
        "mds": {
            "ceph version 18.2.1-278.el9cp (2ae16095654f99a1a043ca3f0c7befcb78080058) reef (stable)": 2
        },
        "rgw": {
            "ceph version 18.2.1-278.el9cp (2ae16095654f99a1a043ca3f0c7befcb78080058) reef (stable)": 1
        },
        "overall": {
            "ceph version 18.2.1-229.el9cp (ef652b206f2487adfc86613646a4cac946f6b4e0) reef (stable)": 8,
            "ceph version 18.2.1-278.el9cp (2ae16095654f99a1a043ca3f0c7befcb78080058) reef (stable)": 12
        }
    }

    In the output above, 8 OSDs are still running the previous Ceph version, 18.2.1-229.el9cp, while the other 4 OSDs and all remaining daemons run 18.2.1-278.el9cp.

  • Why is this issue observed, and how can the 8 OSDs that are still running the older Ceph version be upgraded?
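Both symptoms described above can be confirmed quickly from the command line. The following is a minimal sketch, not an official procedure: the helper names `failed_pvcs` and `osd_versions` are hypothetical, and the parsing assumes the log message and the pretty-printed `ceph versions` layout shown in this article. The commented usage lines reuse the NAMESPACE and ROOK_POD variables from the command above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ODF or Rook); both read text from stdin.

# Print the unique PVC names mentioned in
# "failed to find valid VolumeSource for PVC" operator log lines.
failed_pvcs() {
  sed -n 's/.*failed to find valid VolumeSource for PVC "\([^"]*\)".*/\1/p' | sort -u
}

# Print "<count> <version>" for each entry in the "osd" section of
# pretty-printed `ceph versions` output (one key per line, as shown above).
osd_versions() {
  awk '
    /"osd"[[:space:]]*:[[:space:]]*[{]/ { in_osd = 1; next }  # entering the osd section
    in_osd && /[}]/ { in_osd = 0 }                            # closing brace ends it
    in_osd && match($0, /ceph version [^ ]+/) {
      version = substr($0, RSTART + 13, RLENGTH - 13)         # skip "ceph version "
      count = $0
      sub(/.*":[[:space:]]*/, "", count)                      # keep only the value part
      sub(/[^0-9].*/, "", count)                              # drop commas and annotations
      print count, version
    }
  '
}

# Usage against a live cluster (assumes NAMESPACE and ROOK_POD are set as above):
#   oc -n ${NAMESPACE} logs ${ROOK_POD} | failed_pvcs
#   oc exec ${ROOK_POD} -n ${NAMESPACE} -- ceph versions \
#     --cluster=${NAMESPACE} --conf=/var/lib/rook/${NAMESPACE}/${NAMESPACE}.config \
#     --keyring=/var/lib/rook/${NAMESPACE}/client.admin.keyring | osd_versions
```

More than one line of output from `osd_versions` means the OSDs are running mixed Ceph versions, and `failed_pvcs` lists the PVCs whose OSDs the operator could not update.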

Environment

  • Red Hat OpenShift Data Foundation version 4.x with flexible scaling enabled.
