forked from abacusmodeling/abacus-develop
First stage of add DSP FFT #5878
Merged
Changes from 13 commits
20 commits
c1dac21 set fft_dsp (A-006)
4a2894b add information in map (A-006)
51756fd update Global_rank (A-006)
9549092 update control flow (A-006)
5d8dfe0 [pre-commit.ci lite] apply automatic fixes (pre-commit-ci-lite[bot])
fb1e23a Merge branch 'develop' into fft10 (A-006)
862d1cf Merge branch 'develop' into fft10 (A-006)
e5d9214 add the fft_dsp in the fft_bundle (A-006)
88c25d7 change teh cmake file (A-006)
29e5068 Merge branch 'develop' into fft10 (A-006)
18a64b6 modify back scalapck (A-006)
6e151a2 set the dsp ig2ixyz_k_cpu (A-006)
813bf05 modify the pw_basis (A-006)
50a8366 add the namespace (A-006)
f6ca1f1 Merge branch 'develop' into fft10 (A-006)
7cc5469 remove mutable (A-006)
f2ea839 Merge branch 'develop' into fft10 (A-006)
8d86c06 Merge branch 'develop' into fft10 (A-006)
8c18170 fix fft_dsp (A-006)
b896780 add the convolution and allocate or destroy the b_id (A-006)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,201 @@ | ||
| #include "dsp_connector.h" | ||
| #include <iostream> | ||
| #include <complex> | ||
| | ||
| extern "C" | ||
| { | ||
| #define complex_double ignore_complex_double | ||
| #include <mt_hthread_blas.h> // MTBLAS_TRANSPOSE etc | ||
| #undef complex_double | ||
| #include <mtblas_interface.h> // gemm | ||
| } | ||
| | ||
| void dspInitHandle(int id){ | ||
| mt_blas_init(id); | ||
| std::cout << " ** DSP inited on cluster "<< id << " **" << std::endl; | ||
| } // Use this at the beginning of the program to start a dsp cluster | ||
| | ||
| void dspDestoryHandle(int id){ | ||
| hthread_dev_close(id); | ||
| std::cout << " ** DSP closed on cluster "<< id << " **" << std::endl; | ||
| } // Close dsp cluster at the end | ||
| | ||
| MTBLAS_TRANSPOSE convertBLASTranspose(const char* blasTrans) { | ||
| switch (blasTrans[0]) { | ||
| case 'N': | ||
| case 'n': | ||
| return MtblasNoTrans; | ||
| case 'T': | ||
| case 't': | ||
| return MtblasTrans; | ||
| case 'C': | ||
| case 'c': | ||
| return MtblasConjTrans; | ||
| default: | ||
| std::cout << "Invalid BLAS transpose parameter!! Use default instead." << std::endl; | ||
| return MtblasNoTrans; | ||
| } | ||
| } // Converts a standard BLAS transpose character ('N', 'T', 'C', either case) to the matching MTBLAS_TRANSPOSE flag | ||
| | ||
| void* malloc_ht(size_t bytes, int cluster_id) | ||
| { | ||
| //std::cout << "MALLOC " << cluster_id; | ||
| void* ptr = hthread_malloc((int)cluster_id, bytes, HT_MEM_RW); | ||
| //std::cout << ptr << " SUCCEED" << std::endl;; | ||
| return ptr; | ||
| } | ||
| | ||
| // malloc_ht is used in place of the standard malloc | ||
| | ||
| void free_ht(void* ptr) | ||
| { | ||
| //std::cout << "FREE " << ptr; | ||
| hthread_free(ptr); | ||
| //std::cout << " FREE SUCCEED" << std::endl; | ||
| } | ||
| | ||
| // free_ht is used in place of the standard free | ||
| | ||
| void sgemm_mt_(const char *transa, const char *transb, | ||
| const int *m, const int *n, const int *k, | ||
| const float *alpha, const float *a, const int *lda, | ||
| const float *b, const int *ldb, const float *beta, | ||
| float *c, const int *ldc, int cluster_id) | ||
| { | ||
| mtblas_sgemm(MTBLAS_ORDER::MtblasColMajor, | ||
| convertBLASTranspose(transa),convertBLASTranspose(transb), | ||
| *m,*n,*k, | ||
| *alpha, a, *lda, | ||
| b, *ldb, *beta, | ||
| c, *ldc, cluster_id | ||
| ); | ||
| } // sgemm via mtblas_sgemm; needs no malloc_ht or free_ht | ||
| | ||
| void dgemm_mt_(const char *transa, const char *transb, | ||
| const int *m, const int *n, const int *k, | ||
| const double *alpha, const double *a, const int *lda, | ||
| const double *b, const int *ldb, const double *beta, | ||
| double *c, const int *ldc, int cluster_id) | ||
| { | ||
| mtblas_dgemm(MTBLAS_ORDER::MtblasColMajor, | ||
| convertBLASTranspose(transa),convertBLASTranspose(transb), | ||
| *m,*n,*k, | ||
| *alpha, a, *lda, | ||
| b, *ldb, *beta, | ||
| c, *ldc, cluster_id | ||
| ); | ||
| } // dgemm via mtblas_dgemm; needs no malloc_ht or free_ht | ||
| | ||
| void zgemm_mt_(const char *transa, const char *transb, | ||
| const int *m, const int *n, const int *k, | ||
| const std::complex<double> *alpha, const std::complex<double> *a, const int *lda, | ||
| const std::complex<double> *b, const int *ldb, const std::complex<double> *beta, | ||
| std::complex<double> *c, const int *ldc, int cluster_id) | ||
| { | ||
| mtblas_zgemm(MTBLAS_ORDER::MtblasColMajor, | ||
| convertBLASTranspose(transa),convertBLASTranspose(transb), | ||
| *m,*n,*k, | ||
| (const void*)alpha, (const void*)a, *lda, | ||
| (const void*)b, *ldb, (const void*)beta, | ||
| (void*)c, *ldc, cluster_id | ||
| ); | ||
| } // zgemm that needn't malloc_ht or free_ht | ||
| | ||
| void cgemm_mt_(const char *transa, const char *transb, | ||
| const int *m, const int *n, const int *k, | ||
| const std::complex<float> *alpha, const std::complex<float> *a, const int *lda, | ||
| const std::complex<float> *b, const int *ldb, const std::complex<float> *beta, | ||
| std::complex<float> *c, const int *ldc, int cluster_id) | ||
| { | ||
| mtblas_cgemm(MTBLAS_ORDER::MtblasColMajor, | ||
| convertBLASTranspose(transa),convertBLASTranspose(transb), | ||
| *m,*n,*k, | ||
| (const void*)alpha, (const void*)a, *lda, | ||
| (const void*)b, *ldb, (const void*)beta, | ||
| (void*)c, *ldc, cluster_id | ||
| ); | ||
| } // cgemm that needn't malloc_ht or free_ht | ||
| | ||
| // The *_mth_ wrappers below call the mt_hthread_* routines instead of the mtblas_* ones | ||
| | ||
| void sgemm_mth_(const char *transa, const char *transb, | ||
| const int *m, const int *n, const int *k, | ||
| const float *alpha, const float *a, const int *lda, | ||
| const float *b, const int *ldb, const float *beta, | ||
| float *c, const int *ldc, int cluster_id) | ||
| { | ||
| mt_hthread_sgemm(MTBLAS_ORDER::MtblasColMajor, | ||
| convertBLASTranspose(transa),convertBLASTranspose(transb), | ||
| *m,*n,*k, | ||
| *alpha, a, *lda, | ||
| b, *ldb, *beta, | ||
| c, *ldc, cluster_id | ||
| ); | ||
| } // sgemm via mt_hthread_sgemm; alpha and beta are passed by value | ||
| | ||
| void dgemm_mth_(const char *transa, const char *transb, | ||
| const int *m, const int *n, const int *k, | ||
| const double *alpha, const double *a, const int *lda, | ||
| const double *b, const int *ldb, const double *beta, | ||
| double *c, const int *ldc, int cluster_id) | ||
| { | ||
| mt_hthread_dgemm(MTBLAS_ORDER::MtblasColMajor, | ||
| convertBLASTranspose(transa),convertBLASTranspose(transb), | ||
| *m,*n,*k, | ||
| *alpha, a, *lda, | ||
| b, *ldb, *beta, | ||
| c, *ldc, cluster_id | ||
| ); | ||
| } // dgemm via mt_hthread_dgemm; alpha and beta are passed by value | ||
| | ||
| void zgemm_mth_(const char *transa, const char *transb, | ||
| const int *m, const int *n, const int *k, | ||
| const std::complex<double> *alpha, | ||
| const std::complex<double> *a, | ||
| const int *lda, | ||
| const std::complex<double> *b, | ||
| const int *ldb, | ||
| const std::complex<double> *beta, | ||
| std::complex<double> *c, | ||
| const int *ldc, | ||
| int cluster_id) | ||
| { | ||
| std::complex<double>* alp = (std::complex<double>*) malloc_ht(sizeof(std::complex<double>), cluster_id); | ||
| *alp = *alpha; | ||
| std::complex<double>* bet = (std::complex<double>*) malloc_ht(sizeof(std::complex<double>), cluster_id); | ||
| *bet = *beta; | ||
| mt_hthread_zgemm(MTBLAS_ORDER::MtblasColMajor, | ||
| convertBLASTranspose(transa),convertBLASTranspose(transb), | ||
| *m,*n,*k, | ||
| alp, a, *lda, | ||
| b, *ldb, bet, | ||
| c, *ldc, cluster_id | ||
| ); | ||
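| | ||
| // Release the malloc_ht copies of alpha and beta, as cgemm_mth_ does; without this they leak on every call | ||
| free_ht(alp); | ||
| free_ht(bet); | ||
| } // zgemm via mt_hthread_zgemm; alpha and beta are staged with malloc_ht inside the wrapper | ||
| | ||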
| void cgemm_mth_(const char *transa, const char *transb, | ||
| const int *m, const int *n, const int *k, | ||
| const std::complex<float> *alpha, const std::complex<float> *a, const int *lda, | ||
| const std::complex<float> *b, const int *ldb, const std::complex<float> *beta, | ||
| std::complex<float> *c, const int *ldc, int cluster_id) | ||
| { | ||
| std::complex<float>* alp = (std::complex<float>*) malloc_ht(sizeof(std::complex<float>), cluster_id); | ||
| *alp = *alpha; | ||
| std::complex<float>* bet = (std::complex<float>*) malloc_ht(sizeof(std::complex<float>), cluster_id); | ||
| *bet = *beta; | ||
| | ||
| mt_hthread_cgemm(MTBLAS_ORDER::MtblasColMajor, | ||
| convertBLASTranspose(transa),convertBLASTranspose(transb), | ||
| *m,*n,*k, | ||
| (const void*)alp, (const void*)a, *lda, | ||
| (const void*)b, *ldb, (const void*)bet, | ||
| (void*)c, *ldc, cluster_id | ||
| ); | ||
| | ||
| free_ht(alp); | ||
| free_ht(bet); | ||
| } // cgemm via mt_hthread_cgemm; alpha and beta are staged with malloc_ht and released with free_ht inside the wrapper | ||
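All eight wrappers in the diff take Fortran-style BLAS arguments (pointers to `m`, `n`, `k`, the leading dimensions and the scalars), translate the transpose characters, and forward to a column-major GEMM. As a reading aid, the sketch below spells out that operation in portable C++, assuming the vendor routines follow the standard BLAS definition C = alpha * op(A) * op(B) + beta * C. It does not use the mtblas or hthread headers: `Trans`, `to_trans`, `zgemm_ref` and `example_conj_transpose` are hypothetical stand-ins invented for this illustration, and the reference loop makes no attempt at performance.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical stand-in for MTBLAS_TRANSPOSE; the real enum lives in the vendor headers.
enum class Trans { No, Yes, Conj };

// Same character mapping as convertBLASTranspose in the diff:
// 'N'/'n' -> no transpose, 'T'/'t' -> transpose, 'C'/'c' -> conjugate transpose,
// anything else falls back to no transpose.
Trans to_trans(const char* flag)
{
    switch (flag[0]) {
    case 'T': case 't': return Trans::Yes;
    case 'C': case 'c': return Trans::Conj;
    default:            return Trans::No;
    }
}

// Element (i, j) of op(X), where X is stored column-major with leading dimension ld.
static cplx op_at(const cplx* x, int ld, Trans t, int i, int j)
{
    if (t == Trans::No) {
        return x[i + static_cast<std::size_t>(j) * ld];
    }
    const cplx v = x[j + static_cast<std::size_t>(i) * ld];
    return t == Trans::Conj ? std::conj(v) : v;
}

// Reference column-major GEMM with the same argument order as zgemm_mt_
// (minus cluster_id): C = alpha * op(A) * op(B) + beta * C,
// where op(A) is m x k, op(B) is k x n and C is m x n.
void zgemm_ref(const char* transa, const char* transb,
               const int* m, const int* n, const int* k,
               const cplx* alpha, const cplx* a, const int* lda,
               const cplx* b, const int* ldb, const cplx* beta,
               cplx* c, const int* ldc)
{
    const Trans ta = to_trans(transa);
    const Trans tb = to_trans(transb);
    assert(*ldc >= *m); // C must have room for m rows per column
    for (int j = 0; j < *n; ++j) {
        for (int i = 0; i < *m; ++i) {
            cplx sum = 0.0;
            for (int p = 0; p < *k; ++p) {
                sum += op_at(a, *lda, ta, i, p) * op_at(b, *ldb, tb, p, j);
            }
            cplx& cij = c[i + static_cast<std::size_t>(j) * *ldc];
            cij = *alpha * sum + *beta * cij;
        }
    }
}

// Worked call: A = [[1+i, 2], [3, 4]], B = identity, op(A) = conjugate transpose,
// alpha = 1, beta = 0. Returns C in column-major order.
std::vector<cplx> example_conj_transpose()
{
    const int n = 2;
    const cplx alpha = 1.0;
    const cplx beta = 0.0;
    const std::vector<cplx> a = {cplx(1.0, 1.0), cplx(3.0, 0.0), cplx(2.0, 0.0), cplx(4.0, 0.0)};
    const std::vector<cplx> b = {cplx(1.0, 0.0), cplx(0.0, 0.0), cplx(0.0, 0.0), cplx(1.0, 0.0)};
    std::vector<cplx> c(4, cplx(0.0, 0.0));
    zgemm_ref("C", "N", &n, &n, &n, &alpha, a.data(), &n, b.data(), &n, &beta, c.data(), &n);
    return c;
}
```

Calling `example_conj_transpose()` yields the column-major sequence 1-i, 2, 3, 4, which is the conjugate transpose of A. A reference like this can serve as a host-side check when comparing the output of `zgemm_mt_` or `zgemm_mth_` on small inputs, provided the assumption about standard BLAS semantics holds for the DSP library.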