Compute4PUNCH & Storage4PUNCH: Federated Compute and Storage Infrastructure for Particle, Astro-, and Nuclear Physics

An analysis of the PUNCH4NFDI consortium's concepts for a federated compute and storage infrastructure, which integrates heterogeneous HPC, HTC, and cloud resources from across Germany.

1. Introduction

The PUNCH4NFDI consortium (Particles, Universe, NuClei and Hadrons for the National Research Data Infrastructure), funded by the German Research Foundation (DFG), represents about 9,000 scientists from the particle, astro-, astroparticle, hadron, and nuclear physics communities in Germany. Embedded in the national NFDI initiative, its main goal is to establish a federated, FAIR (Findable, Accessible, Interoperable, Reusable) science data platform. The platform aims to provide seamless access to the heterogeneous compute and storage resources contributed by its member institutions, addressing a challenge these communities share: analyzing exponentially growing data volumes with complex algorithms. This paper details the Compute4PUNCH and Storage4PUNCH concepts designed to federate these resources.

2. Federated Heterogeneous Compute Infrastructure – Compute4PUNCH

Compute4PUNCH addresses the challenge of making efficient use of the wide range of high-throughput computing (HTC), high-performance computing (HPC), and cloud resources contributed across Germany. These resources differ in architecture, operating system, software, and authentication, and they are already in operation for other purposes, which limits the scope for modification.

2.1 Core Architecture & Technologies

The federation is achieved through an overlay batch system. The core technologies are:

  • HTCondor: Forms the backbone of the federated batch system, managing job queues and resource matchmaking across the heterogeneous pool.
  • COBalD/TARDIS: Acts as the resource meta-scheduler. It dynamically and transparently integrates external resources (e.g., from HPC centers or clouds) into the HTCondor pool. TARDIS "translates" HTCondor job requirements into commands for external resource APIs (such as OpenStack or Slurm), while COBalD decides when to acquire or release these external resources based on cost and demand, optimizing a utility function $U(R, C)$, where $R$ is resource performance and $C$ is cost.
  • Token-based Authentication and Authorization Infrastructure (AAI): Provides standardized, secure access across all resources, reducing the need for individual user accounts on each system.
  • CVMFS (CERN Virtual Machine File System) & Containers: Ensure scalable provisioning of community-specific software environments. CVMFS delivers software repositories, while container technologies (e.g., Docker, Singularity) provide isolated, reproducible runtime environments, solving the software dependency problem across diverse infrastructures.
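
The acquire-or-release decision described for COBalD can be pictured as a small feedback loop. The following Python sketch is only an illustration of that idea and does not use the real COBalD API: the function name `adjust_demand`, the choice of utilisation and allocation as inputs, the thresholds, and the step size are all assumptions made for this example. A cost term such as the $C$ in $U(R, C)$ could be folded into the same decision.

```python
def adjust_demand(demand, utilisation, allocation,
                  low_utilisation=0.5, high_allocation=0.9, step=1):
    """Return a new demand (number of external worker slots to request).

    utilisation: fraction of acquired resources actually doing work (0..1)
    allocation:  fraction of acquired resources assigned to jobs (0..1)
    All thresholds are hypothetical values chosen for illustration.
    """
    if utilisation < low_utilisation:
        # Acquired resources sit idle: release some of them.
        return max(0, demand - step)
    if allocation > high_allocation:
        # The pool is nearly full: acquire more.
        return demand + step
    # Otherwise keep the current demand.
    return demand


demand = 4
history = []
for utilisation, allocation in [(0.95, 0.97), (0.92, 0.95), (0.30, 0.40)]:
    demand = adjust_demand(demand, utilisation, allocation)
    history.append(demand)
print(history)
```

The loop grows the demand while the pool is saturated and shrinks it once the acquired resources are mostly idle, which is the behaviour the bullet above attributes to the meta-scheduler.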

2.2 Access & User Interface

User entry points are designed for ease of use:

  • Traditional Login Nodes: Provide a familiar command-line interface for experienced users.
  • JupyterHub: Offers a web-based interactive computing environment (notebooks), lowering the barrier to data exploration and analysis.

Both interfaces provide access to the entire federated compute environment and hide the underlying complexity.

3. Federated Storage Infrastructure – Storage4PUNCH

Storage4PUNCH focuses on federating community-provided storage systems, primarily based on the dCache and XRootD technologies, which are well established in High-Energy Physics (HEP). The federation creates a common namespace and access layer. The concept also evaluates existing technologies for:

  • Caching: To improve data access latency and reduce WAN traffic, similar to concepts used in global data grids such as the Worldwide LHC Computing Grid (WLCG).
  • Metadata Handling: Aiming for deeper integration so that data can be discovered by metadata attributes, going beyond simple file location.

The combined Compute4PUNCH and Storage4PUNCH environment enables researchers to run resource-demanding analysis tasks that require coordinated access to compute power and large datasets.
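
To illustrate why caching reduces WAN traffic, here is a minimal read-through cache with least-recently-used eviction. It is a toy model, not the caching layer of dCache or XRootD: the class `ReadThroughCache`, the `fake_remote` function standing in for a wide-area read, and the capacity of two files are assumptions made for this sketch.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Tiny LRU read-through cache; `fetch_remote` stands in for a WAN read."""

    def __init__(self, fetch_remote, capacity=2):
        self.fetch_remote = fetch_remote
        self.capacity = capacity
        self.store = OrderedDict()
        self.remote_reads = 0

    def read(self, path):
        if path in self.store:
            self.store.move_to_end(path)    # mark as most recently used
            return self.store[path]
        self.remote_reads += 1              # cache miss: go over the WAN
        data = self.fetch_remote(path)
        self.store[path] = data
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used
        return data


def fake_remote(path):
    # Hypothetical stand-in for a remote storage read.
    return f"contents of {path}"


cache = ReadThroughCache(fake_remote, capacity=2)
for p in ["/a", "/b", "/a", "/c", "/b"]:
    cache.read(p)
print(cache.remote_reads, list(cache.store))
```

Of the five reads, only the repeated access to `/a` is served locally; a larger cache or a more repetitive access pattern would save proportionally more remote reads.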

4. Technical Details & Mathematical Framework

Resource scheduling by COBalD/TARDIS can be modeled as an optimization problem. Let $J = \{j_1, j_2, ..., j_n\}$ be the set of jobs in the HTCondor queue, and $P = \{p_1, p_2, ..., p_m\}$ the pool of available resources (local and external). Each job $j_i$ has requirements $R_i$ (CPU cores, memory, GPU, software). Each resource $p_k$ has capabilities $C_k$ and a cost function $\text{Cost}(p_k, t)$, which may be monetary or based on priority or credits.

The scheduler's goal is to find a mapping $M: J \rightarrow P$ that minimizes total cost or makespan while satisfying the constraints: $$\text{minimize } \sum_{j_i \in J} \text{Cost}(M(j_i), t)$$ $$\text{subject to } R_i \subseteq C_{M(j_i)} \text{ for all } j_i \in J.$$ COBalD uses heuristic or machine-learning strategies to solve this dynamic, online problem as jobs and resource availability change.
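
A simple way to see the structure of this problem is a greedy heuristic that assigns each job to the cheapest resource that still satisfies its requirements. The sketch below is not COBalD's algorithm and gives no optimality guarantee; the function `greedy_assign`, the resource names, and all numbers are assumptions for illustration. Requirements $R_i$ and capabilities $C_k$ are modeled as CPU and memory amounts, and the cost is charged once per job placed on a resource.

```python
def greedy_assign(jobs, resources):
    """Map each job to the cheapest resource that still has room for it.

    jobs:      {job_name: {"cpus": int, "memory_gb": int}}
    resources: {res_name: {"cpus": int, "memory_gb": int, "cost": float}}
    Returns {job_name: res_name or None}. Greedy heuristic, not optimal.
    """
    # Remaining capacity per resource, reduced as jobs are placed.
    free = {name: {"cpus": r["cpus"], "memory_gb": r["memory_gb"]}
            for name, r in resources.items()}
    by_cost = sorted(resources, key=lambda name: resources[name]["cost"])
    mapping = {}
    for job, need in jobs.items():
        mapping[job] = None  # stays None if no resource can host the job
        for name in by_cost:
            if all(need[k] <= free[name][k] for k in need):
                for k in need:
                    free[name][k] -= need[k]
                mapping[job] = name
                break
    return mapping


resources = {
    "local": {"cpus": 4, "memory_gb": 8, "cost": 1.0},
    "cloud": {"cpus": 16, "memory_gb": 64, "cost": 5.0},
}
jobs = {
    "j1": {"cpus": 2, "memory_gb": 4},
    "j2": {"cpus": 4, "memory_gb": 8},
    "j3": {"cpus": 2, "memory_gb": 4},
}
mapping = greedy_assign(jobs, resources)
total_cost = sum(resources[r]["cost"] for r in mapping.values() if r)
print(mapping, total_cost)
```

In this example the cheap local resource fills up first and the larger job spills over to the more expensive cloud resource, which mirrors the bursting behaviour described earlier.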

5. Experimental Results & Prototype Performance

The paper reports first experiences with scientific applications on the available prototypes. Although the provided abstract does not give specific benchmark figures, the successful execution of diverse community applications validates the architecture. Key performance indicators (KPIs) for such a federation typically include:

  • Job Throughput: Number of jobs completed per day across the federated system.
  • Resource Utilization: Fraction of time that contributed resources (especially external, burstable ones) are actively in use, reflecting the efficiency of COBalD's dynamic provisioning.
  • Data Transfer Efficiency: Latency and bandwidth for jobs accessing data from the Storage4PUNCH federation, critical for I/O-heavy analyses.
  • User Satisfaction: Reduced job-submission complexity and waiting time, measured through user surveys.

The prototype phase is essential for stress-testing the AAI integration, the robustness of the HTCondor overlay, and the ability of CVMFS to deliver software to thousands of concurrent jobs.
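
Two of the KPIs listed above reduce to simple ratios. The sketch below shows one possible way to compute them; the function names and the input numbers are invented for illustration and are not measurements from the paper.

```python
def job_throughput(completed_jobs, days):
    """Completed jobs per day over the observation window."""
    return completed_jobs / days


def resource_utilisation(busy_core_hours, provisioned_core_hours):
    """Fraction of provisioned core-hours that actually ran jobs."""
    if provisioned_core_hours == 0:
        return 0.0
    return busy_core_hours / provisioned_core_hours


# Invented example values, not results reported by PUNCH4NFDI.
throughput = job_throughput(completed_jobs=42000, days=7)
utilisation = resource_utilisation(busy_core_hours=7200,
                                   provisioned_core_hours=9600)
print(throughput, utilisation)
```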

6. Analysis Framework: Use-Case Example

Scenario: A nuclear physicist needs to process 1 petabyte of detector data using a complex Monte Carlo simulation chain.

  1. Access: The researcher logs in to the PUNCH JupyterHub with institutional credentials (via the token-based AAI).
  2. Software: The notebook automatically mounts the required software stack from CVMFS and launches a container with the specialized simulation libraries.
  3. Data: The notebook code references data through the Storage4PUNCH federated namespace (e.g., `root://punch-federation.de/path/to/data`). The XRootD protocols handle location and transfer.
  4. Compute: The researcher submits 10,000 parallel jobs via a Python wrapper that interfaces with the HTCondor REST API. COBalD/TARDIS provisions a mix of local HTCondor workers and burstable HPC and cloud nodes to absorb the peak load.
  5. Orchestration: HTCondor manages the job lifecycle. Results are written back to the federated storage. The researcher monitors progress through the JupyterHub dashboard.
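
The compute step above can be sketched as a small helper that writes an HTCondor submit description for the 10,000 jobs. This is an assumption-laden illustration, not the wrapper the scenario refers to: the function `build_submit_description`, the script name `run_simulation.sh`, and the resource requests are hypothetical, and the sketch only produces text in the submit-description format rather than calling any HTCondor service.

```python
def build_submit_description(executable, input_url, n_jobs,
                             request_cpus=1, request_memory="2GB"):
    """Return HTCondor submit-description text for n_jobs identical jobs.

    $(Process) is expanded by HTCondor to 0..n_jobs-1, so every job
    receives its own index and writes to its own output files.
    """
    lines = [
        f"executable = {executable}",
        f"arguments = {input_url} $(Process)",
        f"request_cpus = {request_cpus}",
        f"request_memory = {request_memory}",
        "output = job_$(Process).out",
        "error = job_$(Process).err",
        "log = jobs.log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


description = build_submit_description(
    executable="run_simulation.sh",  # hypothetical user script
    input_url="root://punch-federation.de/path/to/data",
    n_jobs=10000,
)
print(description)
```

Text in this form could be saved to a file and handed to `condor_submit`; which interface a real wrapper would use depends on how the federation exposes HTCondor to notebooks.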

This scenario illustrates the seamless integration the framework aims for, hiding the complexity of the infrastructure.

7. Future Applications & Development Roadmap

The PUNCH4NFDI infrastructure is a blueprint for research federation at national scale.

  • Cross-Consortium Federation: The model could extend to other NFDI consortia (e.g., life sciences, engineering), creating a true National Research Data Infrastructure backbone. Inter-consortium AAI and resource-sharing agreements would be key.
  • Integration of Edge & Quantum Resources: As edge computing (for instrument data preprocessing) and quantum computing mature, the meta-scheduler architecture could be extended to include them as specialized resource types.
  • AI/ML Workload Optimization: The scheduling algorithms could incorporate predictors for AI/ML job runtimes (similar to approaches in projects such as `Optuna` or `Ray Tune`) to further optimize placement, especially for GPU resources.
  • Enhanced Metadata & Data Lakes: Deeper integration of metadata catalogs could turn Storage4PUNCH into an active data lake, enabling data-centric scheduling in which compute jobs are dispatched to where the data resides.
  • Sustainability Focus: Future versions could optimize for carbon footprint by preferring data centers with a higher share of renewable energy, in line with Green Computing initiatives such as those under the European Green Deal.
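
As a rough illustration of the sustainability idea in the last bullet, a scheduler could prefer the site with the lowest carbon intensity among those that have enough free capacity. The sketch below is purely hypothetical: the function `pick_greenest_site`, the site names, and the carbon-intensity figures are invented for this example and do not describe any existing PUNCH4NFDI feature.

```python
def pick_greenest_site(sites, cores_needed):
    """Return the name of the lowest-carbon site with enough free cores.

    sites: list of dicts with "name", "free_cores", and "gco2_per_kwh".
    Returns None when no site can host the job.
    """
    candidates = [s for s in sites if s["free_cores"] >= cores_needed]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s["gco2_per_kwh"])["name"]


# Invented sites and carbon intensities, for illustration only.
sites = [
    {"name": "site_a", "free_cores": 128, "gco2_per_kwh": 420},
    {"name": "site_b", "free_cores": 16, "gco2_per_kwh": 90},
    {"name": "site_c", "free_cores": 256, "gco2_per_kwh": 210},
]
small_job_site = pick_greenest_site(sites, cores_needed=8)
large_job_site = pick_greenest_site(sites, cores_needed=64)
print(small_job_site, large_job_site)
```

A production policy would have to weigh this preference against queue time, data locality, and cost, so carbon intensity would more likely enter as one term of the cost function than as the sole criterion.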

9. Analyst's Perspective: Core Insight, Logical Flow, Strengths & Flaws, Actionable Insights

Core Insight: PUNCH4NFDI is not building a new supercomputer; it is building a federation operating system. The real innovation is a pragmatic, overlay-based approach that wraps existing, bureaucratic, heterogeneous institutional resources into a single, user-friendly platform. This is less about raw technological breakthroughs than about socio-technical orchestration at national scale. It directly confronts the "tragedy of the commons" in research computing, where resources are siloed and underutilized, by creating a managed marketplace for compute cycles and data.

Logical Flow: The logic is thoroughly pragmatic. 1) Accept Heterogeneity as a First-Class Citizen: Rather than forcing standardization (a political non-starter), they abstract it away with HTCondor and containers. 2) Minimize Provider Friction: The COBalD/TARDIS model is a stroke of genius. It is a parasitic scheduler that does not require HPC centers to change their local policies, which makes adoption palatable. 3) Maximize User Simplicity: JupyterHub and token-based AAI are the killer features for adoption, hiding immense backend complexity behind a browser tab. 4) Leverage Community Trust: Building on battle-tested HEP tools (dCache, XRootD, CVMFS) is not only technically sound; it provides instant credibility and reduces operational risk.

Strengths & Flaws: Its strength is its deployability. This is not a research-paper fantasy; it is a working prototype built from mature open-source components. The federated-storage vision, if fully realized with metadata, could be transformative. However, the flaws are in the seams. The overhead of the meta-scheduler layer and of wide-area data movement could negate the benefits for tightly coupled HPC applications. The model is inherently best suited to high-throughput, loosely coupled workloads. There is also a governance time bomb: who prioritizes jobs when demand exceeds the federated supply? The paper glosses over the inevitable political battles over fair-share algorithms and cost attribution between institutions. Finally, although it mentions "cloud" resources, the economic model for bursting to commercial clouds (AWS, Google Cloud) with real money, not just credits, is unexplored territory fraught with budgeting risk.

Actionable Insights: 1) For other consortia: Copy this blueprint immediately. The architectural pattern is reusable. Start with AAI and a simple job gateway. 2) For PUNCH4NFDI itself: Publish hard performance data. To build trust, it must show transparently what federation costs in overhead compared with native access. 3) Develop a granular, multi-dimensional fair-share policy now, before conflicts arise. Involve lawyers and accountants, not only physicists. 4) Explore integration with workflow managers (Nextflow, Snakemake). These are becoming the de facto standard for reproducible science; native integration would be a major win. 5) Consider a "Federation Maturity Model" to onboard resource providers gradually, from simple batch access to full data/compute co-scheduling. This is not just infrastructure; it is a new paradigm for organizing national research capacity. Its success will depend as much on governance and community buy-in as on the elegance of its code.